Data Engineer
Company: Compunnel Software Group
Location: Suffolk, VA 23434
Description:
Job Summary:
We are seeking a highly skilled Data Engineer to design, develop, and optimize batch ETL pipelines on Databricks within a cloud-based Lakehouse architecture. The role focuses on ingesting high volumes of banking and financial data, ensuring data integrity, governance, and performance for analytics, regulatory, and machine learning use cases. The ideal candidate brings hands-on experience with Apache Spark, Delta Lake, and Azure cloud data services, along with a deep understanding of data compliance and banking regulations.
Job Responsibilities:
- Build and Integrate Data Pipelines: Design and implement batch ETL pipelines using Databricks (PySpark, Spark SQL) to support a scalable Lakehouse infrastructure.
- Data Quality and Integrity: Ensure high data quality through validation, cleansing, enrichment, and versioning using Delta Lake ACID features.
- Performance Optimization: Tune Spark jobs, optimize partitioning, and manage dependencies to meet SLAs for large-scale data processing.
- Governance and Compliance: Enforce data security and compliance policies (e.g., encryption, data lineage, access control) per banking regulations (AML, BSA).
- Cross-Team Collaboration: Partner with data architects, analysts, BI, risk, and data science teams to deliver reliable and insightful data pipelines.
- Continuous Improvement: Automate and enhance pipeline workflows using tools such as Databricks Jobs, Delta Live Tables (DLT), and Airflow.
- Support Auditability: Maintain historical data and change tracking for audit and compliance requirements.
- Other Duties: Adhere to federal laws and regulations and perform other duties as assigned.
Required Skills:
- Bachelor's degree in Computer Science or related field (or equivalent experience).
- 3+ years of hands-on experience in data engineering with Databricks and Apache Spark (PySpark, Spark SQL).
- Expertise in Delta Lake and Lakehouse architecture.
- Experience with Databricks Delta Live Tables (DLT) for declarative pipeline development.
- Proficient in Python and SQL; knowledge of Scala/Java is a plus.
- Hands-on experience with Azure or any major cloud platform (e.g., AWS, GCP).
- Familiar with cloud data services (Azure Data Lake Storage, Azure Data Factory, S3, BigQuery).
- Strong data modeling and warehousing knowledge (OLTP/OLAP, star schema, dimensional modeling).
- Pipeline orchestration using Airflow, Databricks Jobs, or ADF.
- Strong understanding of data validation, unit testing, and monitoring practices.
- Awareness of PII handling, encryption, auditing, and compliance standards.
- Experience with version control (Git) and CI/CD for data pipeline deployment.
- Strong analytical, problem-solving, and communication skills.
Preferred Skills:
- Banking or financial data experience (e.g., transactions, fraud, trading, risk analytics).
- Knowledge of regulatory reporting pipelines (e.g., CCAR, AML, Basel).
- Familiarity with real-time streaming (Kafka, Spark Structured Streaming).
- Exposure to DataOps and MLOps practices.
- Relevant certifications (e.g., Databricks Certified Data Engineer, Azure Data Engineer Associate).
Certifications:
Databricks Certified Data Engineer (Preferred)
Azure/AWS/GCP Data Engineer Certifications (Preferred)
Education: Bachelor's Degree
Certification: Databricks Certified Data Engineer, GCP Data Engineer Certifications