Data Engineer
Company: Codeforce360
Location: Columbus, IN 47201
Description:
Job Description:
- Supports, develops, and maintains a data and analytics platform.
- Effectively and efficiently processes, stores, and makes data available to analysts and other consumers.
- Works with business and IT teams to understand requirements and leverage technologies that enable agile data delivery at scale.
Responsibilities:
- Implements and automates deployment of our distributed system for ingesting and transforming data from various types of sources (relational, event-based, unstructured).
- Designs and implements Spark Structured Streaming and API workflows (a minimal sketch follows this list).
- Implements methods to continuously monitor and troubleshoot data quality and data integrity issues.
- Implements data governance processes and methods for managing metadata, access, and retention of data for internal and external users.
- Develops reliable, efficient, scalable, high-quality data pipelines with monitoring and alerting mechanisms, combining a variety of sources using ETL/ELT tools or scripting languages.
- Develops physical data models and implements data storage architectures as per design guidelines.
- Analyzes complex data elements and systems, data flows, dependencies, and relationships in order to contribute to conceptual, physical, and logical data models.
- Participates in testing and troubleshooting of data pipelines.
- Develops and operates large-scale data storage and processing solutions using distributed and cloud-based platforms (e.g., data lakes, Hadoop, HBase, Cassandra, MongoDB, Accumulo, DynamoDB, and others).
- Uses agile development practices such as DevOps, Scrum, Kanban, and continuous improvement cycles for data-driven applications; attends daily stand-ups.
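
For context, a Spark Structured Streaming and API workflow of the kind described above might look like the following minimal Scala sketch, which reads JSON events from a Kafka topic and writes them to Parquet with checkpointing. The topic name, broker address, schema, and paths are illustrative assumptions, not details taken from this posting.

// Minimal Spark Structured Streaming sketch in Scala. The Kafka topic,
// broker, schema, and paths are hypothetical placeholders for illustration.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

object EventIngest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("event-ingest")
      .getOrCreate()

    // Assumed JSON payload schema for the illustrative "events" topic.
    val schema = new StructType()
      .add("device_id", StringType)
      .add("ts", TimestampType)
      .add("reading", DoubleType)

    // Read a continuous stream of records from Kafka.
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "events")
      .load()

    // Parse the Kafka message value as JSON and project the typed columns.
    val events = raw
      .select(from_json(col("value").cast("string"), schema).as("e"))
      .select("e.*")

    // Append the parsed stream to Parquet, with checkpointing for recovery.
    val query = events.writeStream
      .format("parquet")
      .option("path", "/data/events_parquet")
      .option("checkpointLocation", "/data/checkpoints/events")
      .outputMode("append")
      .start()

    query.awaitTermination()
  }
}

The checkpoint location lets the query recover its Kafka offsets and output progress after a restart, which relates to the monitoring and data-integrity responsibilities listed above.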
Skills:
Knowledge of the latest technologies in data engineering is highly preferred and includes:
- Exposure to open-source Big Data technologies.
- Hands-on experience with Spark Structured Streaming and API workflows.
- Spark, Scala/Java, MapReduce, Hive, HBase, Kafka, and Microsoft Azure Databricks.
- SQL query language.
- Experience implementing clustered compute in cloud-based environments.
- Familiarity with developing applications that require large file movement in a cloud-based environment.
- Exposure to Agile software development.
- Exposure to building analytical solutions.
- Exposure to IoT technology.