Data Engineer

Company: Codeforce360

Location: Columbus, IN 47201

Description:

Required Skills:

Knowledge of the latest technologies in data engineering is highly preferred and includes:
  • Data analysis.
  • Exposure to open-source Big Data tools.
  • Hands-on experience with Spark Structured Streaming and API workflows.
  • Spark, Scala/Java, MapReduce, Hive, HBase, Kafka, and Microsoft Azure Databricks.
  • SQL query language.
  • Experience with clustered-compute, cloud-based implementations.
  • Familiarity with developing applications requiring large file movement in a cloud-based environment.
  • Exposure to Agile software development.
  • Exposure to building analytical solutions.
  • Exposure to IoT technology.

Job Description:
  • Supports, develops, and maintains a data and analytics platform.
  • Processes, stores, and makes data available to analysts and other consumers effectively and efficiently.
  • Works with Business and IT teams to understand requirements and best leverage the technologies that enable agile data delivery at scale.

Responsibilities:
  • Implements and automates deployment of our distributed system for ingesting and transforming data from various types of sources (relational, event-based, unstructured).
  • Designs and implements Spark Structured Streaming and API workflows.
  • Implements methods to continuously monitor and troubleshoot data quality and data integrity issues.
  • Implements data governance processes and methods for managing metadata, access, and retention of data for internal and external users.
  • Develops reliable, efficient, scalable, and quality data pipelines with monitoring and alert mechanisms that combine a variety of sources using ETL/ELT tools or scripting languages.
  • Develops physical data models and implements data storage architectures as per design guidelines.
  • Analyzes complex data elements and systems, data flow, dependencies, and relationships in order to contribute to conceptual, logical, and physical data models.
  • Participates in testing and troubleshooting of data pipelines.
  • Develops and operates large-scale data storage and processing solutions using distributed and cloud-based data storage platforms (e.g., data lakes, Hadoop, HBase, Cassandra, MongoDB, Accumulo, DynamoDB, and others).
  • Uses agile development practices, such as DevOps, Scrum, Kanban, and continuous-improvement cycles, for data-driven applications; attends daily stand-ups.
