Databricks Database Engineer

Company: Sev1Tech

Location: Arlington, VA 22209

Description:

Overview / Job Responsibilities

We are seeking a motivated and experienced Data Engineer to join our team. As a Data Engineer, you will be at the forefront of developing ETL pipelines from various sources, building data models that enable business users to create dashboards and ad-hoc reports, and delivering efficient end-to-end pipelines for machine learning initiatives. In this role you will work closely with cross-functional teams to drive innovation and enhance data analytics capabilities using Databricks. This position offers a unique opportunity to work with cutting-edge technologies and shape the future of our federal customer's data infrastructure.

If you are a motivated and experienced Data Engineer with expertise in Databricks and a passion for building scalable and secure data solutions, we would like to hear from you.

Responsibilities:

  • Design and implement data ingestion pipelines to efficiently ELT/ETL data from various sources into Databricks.
  • Develop, implement, and validate logical and physical data models while expanding the enterprise Lakehouse.
  • Experience with AWS technologies and with relational and non-relational databases (e.g., MySQL, PostgreSQL; ideally MongoDB experience).
  • Work with scalable search technology (e.g., Elasticsearch, Solr).
  • Support the lead engineer in the architecture and engineering of complex data models and workflows, and in customer support.
  • Collaborate with cross-functional teams to gather requirements, understand data integration needs, and define data lake architecture and governance policies.
  • Establish, maintain, and configure Databricks workspaces, clusters, and storage components, optimizing the solution for efficient data processing, query performance, and data governance.
  • Develop and maintain data lake security frameworks, including access controls, encryption solutions, and data masking techniques to protect sensitive data.
  • Develop code or scripts to ingest data from web-based APIs.
  • Collaborate with data engineers, data scientists, data analysts, and business partners to optimize data pipelines, develop data transformations, and ensure data quality and integrity.
  • Monitor and tune Databricks workflows to ensure performance, reliability, and cost optimization, utilizing automated scaling and resource management techniques.
  • Implement best practices for data governance, data cataloging, metadata management, and data lineage within Databricks, adhering to regulatory and compliance requirements.
  • Collaborate with infrastructure teams to ensure Databricks infrastructure meets scalability and availability requirements, leveraging Databricks cluster management and AWS/Azure services.
  • Develop and maintain documentation and guidelines related to the Databricks solution, including architecture diagrams, standards, and processes.
  • Stay up to date with the latest advancements in Databricks, big data technologies, and cloud platforms, continuously evaluating and implementing new features and capabilities.
  • Provide technical guidance and mentorship to junior data engineers and our customers, promoting best practices and fostering a culture of continuous learning and growth.
  • Collaborate with stakeholders to understand their data analytics and reporting needs and develop scalable data models and data transformation processes to support these requirements.
  • Support resolution of data lake-related incidents, troubleshooting data quality issues, performance bottlenecks, and other data-related challenges.
  • Collaborate with data governance and compliance teams to ensure data privacy, security, and compliance guidelines are adhered to within the data lake solution.
  • Participate in the evaluation and selection of new tools, technologies, and services to enhance the data lake infrastructure.
  • Handle large data sets and scale their processing and storage.
  • Author developer-friendly documentation (e.g., API documentation, deployment operations).
  • Hands-on experience writing code in Python/PySpark.


Minimum Qualifications

  • Bachelor's degree in computer science, information technology, or a related field. Equivalent experience will also be considered.
  • Proven experience in creating data pipelines in a Databricks environment from scratch.
  • Proven experience in optimizing existing data pipelines and data modeling.
  • In-depth knowledge of Databricks, including but not limited to workspaces, clusters, storage, notebook development, and automation capabilities.
  • Strong expertise in designing and implementing data ingestion pipelines, data transformations, and data quality processes using Databricks.
  • Experience with big data technologies such as Apache Spark, Apache Hive, Delta Lake, and Hadoop.
  • Solid understanding of data governance principles, data modeling, data cataloging, and metadata management.
  • Hands-on experience with cloud platforms like AWS or Azure, including relevant services like S3, EMR, Glue, Data Factory, etc.
  • Proficiency in SQL and one or more programming languages (Python, Scala, or Java) for data manipulation and transformation.
  • Knowledge of data security and privacy best practices, including data access controls, encryption, and data masking techniques.
  • Strong problem-solving and analytical skills, with the ability to identify and resolve complex data-related issues.
  • Experience providing technical guidance and mentorship to team members new to Databricks, data engineering, and data warehousing best practices.
  • Relevant certifications such as Databricks Certified Developer or Databricks Certified Professional are highly desirable.


Desired Qualifications

  • Experience building and maintaining data warehouse and large-scale data transformations.
  • Experience building dashboards/reports in Power BI or Tableau.
  • Experience deploying machine learning models and deploying and maintaining Docker containers.
  • Experience promoting code into production and with CI/CD practices.
  • Experience working with or administering Windows or Linux servers.
  • Strong working knowledge of NIST SP 800-37 and SP 800-53 requirements.

Clearance Preference:
  • Active DHS/CISA suitability - 1st priority
  • Any DHS badge + DoD Top Secret - 2nd choice
  • DoD Top Secret + willingness to obtain DHS/CISA suitability - 3rd choice (it can take 10-60 days to obtain suitability - work can only begin once suitability is fully adjudicated)


About Sev1Tech LLC

Welcome to Sev1Tech! Founded in 2010, we are proud to be a leading provider of IT modernization, engineering, and program management solutions. Our commitment is to deliver exceptional program and IT support services that empower critical missions for both Federal and Commercial clients.

At Sev1Tech, our mission is clear: Build better companies. Enable better government. Protect our nation. Build better humans across the country. We believe that through innovation and dedication, we can make a significant impact on the communities we serve.

Join the Sev1Tech family, where your potential for greatness is limitless! Here, you will not only achieve remarkable accomplishments but also enjoy a fulfilling and rewarding career progression. We invite you to explore opportunities with us and become part of a team that values your contributions and growth.

Ready to take the next step? Apply directly through our website: Sev1Tech Careers and use the hashtag #joinSev1Tech to connect with us on social media!

For any additional questions or to submit referrals, feel free to reach out to troy.ester@sev1tech.com.
