Data Engineer

Company: Futran Tech Solutions Pvt. Ltd.

Location: Austin, TX 78745

Description:

Role Name: Data Engineer - Data & Reporting

Location: Austin or Bay Area

JD:

This role will build solutions that integrate open source software with A****'s internal ecosystem. You will drive development of new components and features from concept to release: design, build, test, and ship at a regular cadence. You will work closely with internal customers to understand their requirements and workflows, and propose new features and ecosystem changes that streamline their experience with the solutions on our platform.

This is a challenging engineering role: a large part of an engineer's time is spent writing code and designing and developing applications in the cloud, with the remainder spent tuning and debugging the codebase, supporting production applications, and supporting our application end users. The role requires in-depth knowledge of innovative technologies and cloud data platforms, along with the ability to independently learn new technologies and contribute to the success of various initiatives.

Responsibilities:
Build and Optimize Data Pipelines: Analyze and organize raw data from various internal and external sources. Design, develop, and manage scalable, high-performance and resilient data pipelines that process, transform, and make data accessible for real-time analytics and reporting.

Data Infrastructure Development: Design, build, and maintain data infrastructure to ensure data is reliable, accessible, and secure for various applications and stakeholders. Implement a lakehouse architecture.

Data Integration: Analyze business use cases, perform data profiling, and implement the data transformations required to curate raw data into AI/ML- and BI-ready datasets.

Streaming & Batch Processing: Build streaming and batch data pipelines that are reusable, fault-tolerant and scalable.

Data Quality & Governance: Implement data integrity validations and data quality checks to ensure accuracy, consistency, and completeness across all datasets.

Collaboration & Cross-Functional Partnerships: Collaborate with platform engineers, data analysts, and business stakeholders to define and implement data requirements and deliver end-to-end solutions.

Data Warehouse Design: Design, implement, and optimize data lakes, data warehouses, and data marts to meet analytics and reporting needs.

Innovation & Automation: Identify opportunities to automate data workflows, streamline processes, and introduce best-in-class tools and frameworks that enhance productivity and efficiency.

Performance Tuning: Continuously monitor and tune data pipelines and infrastructure for optimal performance, scalability, and cost-efficiency.

Qualifications:

Experience: 10 or more years of experience building and maintaining enterprise-level data applications on distributed systems at scale.

Programming Skills: Expert proficiency in SQL and Python (or a similar language) for data processing and automation. Experience building semantic processes for distributed data applications.

Data Pipeline: Experience building scalable, resilient ELT data pipelines using modern tools such as dbt, Azure, Airflow, Fivetran, or similar. Knowledge of streaming platforms such as Kafka, Spark, or similar.

Data Lake & Warehouse: Experience with cloud-based data platforms like Snowflake. Advanced proficiency in SQL, including writing complex queries and optimizing performance. Familiarity with dimensional and party data models.

Data Governance: Knowledge of data governance practices, including RBAC, RLS, data masking, data lineage, and compliance with regulatory standards (e.g., GDPR, HIPAA). Experience with data governance tools such as DataHub or Collibra.

Automation & DevOps: Expertise in CI/CD pipelines with tools such as GitHub Actions, Jenkins, or AWS CodePipeline, and in containerization (Docker, Kubernetes) to automate and manage data infrastructure. Experience implementing observability with tools such as Prometheus, Grafana, and AWS CloudWatch. Experience with automated testing frameworks (e.g., JUnit, PyTest).

Agile Methodologies: Experience working in an agile pod or scrum team with the ability to iterate and deliver rapidly.

Machine Learning: Hands-on experience with ML infrastructure, frameworks like PyTorch and TensorFlow, and Jupyter notebooks.

Data Visualization: Experience with tools such as Streamlit, Superset, Tableau, Business Objects, and Looker.
