Data Engineer
Company: Codeforce360
Location: Columbus, IN 47201
Description:
Job Description:
- Supports, develops, and maintains a data and analytics platform.
- Effectively and efficiently processes, stores, and makes data available to analysts and other consumers.
- Works with business and IT teams to understand requirements and leverage technologies that enable agile data delivery at scale.
Responsibilities:
- Implements and automates deployment of our distributed system for ingesting and transforming data from various types of sources (relational, event-based, unstructured).
- Designs and implements Spark Structured Streaming and API workflows (a minimal sketch follows this list).
- Implements methods to continuously monitor and troubleshoot data quality and data integrity issues.
- Implements data governance processes and methods for managing metadata, access, and retention of data for internal and external users.
- Develops reliable, efficient, scalable, high-quality data pipelines with monitoring and alerting mechanisms, combining a variety of sources using ETL/ELT tools or scripting languages.
- Develops physical data models and implements data storage architectures as per design guidelines.
- Analyzes complex data elements and systems, data flows, dependencies, and relationships in order to contribute to conceptual, physical, and logical data models.
- Participates in testing and troubleshooting of data pipelines.
- Develops and operates large-scale data storage and processing solutions using distributed and cloud-based platforms (e.g., data lakes, Hadoop, HBase, Cassandra, MongoDB, Accumulo, DynamoDB, and others).
- Uses agile development practices such as DevOps, Scrum, Kanban, and continuous improvement cycles for data-driven applications; attends daily stand-ups.
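
For context, a Spark Structured Streaming and API workflow of the kind described above might look like the following minimal Scala sketch, which reads JSON events from a Kafka topic and writes them to Parquet with checkpointing. The topic name, broker address, schema, and paths are illustrative assumptions, not details taken from this posting.

// Minimal Spark Structured Streaming sketch in Scala. The Kafka topic,
// broker, schema, and paths are hypothetical placeholders for illustration.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

object EventIngest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("event-ingest")
      .getOrCreate()

    // Assumed JSON payload schema for the illustrative "events" topic.
    val schema = new StructType()
      .add("device_id", StringType)
      .add("ts", TimestampType)
      .add("reading", DoubleType)

    // Read a continuous stream of records from Kafka.
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "events")
      .load()

    // Parse the Kafka message value as JSON and project the typed columns.
    val events = raw
      .select(from_json(col("value").cast("string"), schema).as("e"))
      .select("e.*")

    // Append the parsed stream to Parquet, with checkpointing for recovery.
    val query = events.writeStream
      .format("parquet")
      .option("path", "/data/events_parquet")
      .option("checkpointLocation", "/data/checkpoints/events")
      .outputMode("append")
      .start()

    query.awaitTermination()
  }
}

The checkpoint location lets the query recover its Kafka offsets and output progress after a restart, which relates to the monitoring and data-integrity responsibilities listed above.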
Skills:
Knowledge of the latest technologies in data engineering is highly preferred and includes:
- Exposure to open-source Big Data technologies.
- Hands-on experience with Spark Structured Streaming and API workflows.
- Spark, Scala/Java, MapReduce, Hive, HBase, Kafka, and Microsoft Azure Databricks.
- SQL query language.
- Experience implementing clustered compute in cloud-based environments.
- Familiarity with developing applications that require large file movement in a cloud-based environment.
- Exposure to Agile software development.
- Exposure to building analytical solutions.
- Exposure to IoT technology.