Software Engineer - Data Orchestration

Company: Parable

Location: New York, NY 10025

Description:

Overview

We are opening the search for a critical role at Parable: a Software Engineer focused on Data Orchestration.

This person will play an essential role in building the data infrastructure that transforms how companies understand and optimize their most precious resource - time.

As a key member of our data platform team, you'll design and implement the scalable data orchestration systems that power our AI-driven insights, working directly with our ML and AI Engineering teams to ensure data flows seamlessly throughout our platform.

If you're excited about building sophisticated data systems while working with seasoned entrepreneurs on a mission to make time matter in a world that hijacks our attention, we'd love to talk.
This role is for someone who:
  • Is passionate about building robust, scalable data systems. You're not just a developer - you're an architect who thinks deeply about data flows, pipeline efficiency, and system reliability. You've spent years building data infrastructure, and you're constantly exploring new approaches and technologies.
  • Combines technical excellence with business impact. You can architect complex data orchestration systems and write efficient code, but you never lose sight of what truly matters - enabling Research teams to deliver insights to customers. You're as comfortable diving deep into technical specifications as you are collaborating with ML engineers to understand their data processing needs.
  • Has deep expertise in data engineering. You understand the intricacies of building reliable data pipelines at scale, with experience in modern data processing frameworks like PySpark and Polars (a short illustrative sketch follows this list). You have a knack for solving complex data integration challenges and a passion for data quality and integrity.
  • Is a lean experimenter at heart. You believe in shipping to learn, but you also know how to build for scale. You have a track record of delivering results in one-third the time that most competent engineers think possible, not by cutting corners, but through smart architectural decisions and iterative development.
  • Exercises extreme ownership. You take full responsibility for your work, cast no blame, and make no excuses. When issues arise, you're the first to identify solutions rather than point fingers. You see it as your obligation to challenge decisions when you disagree, and seek the scrutiny of your own ideas.
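For a concrete taste of that stack, here is a minimal, purely illustrative Polars sketch of the kind of transformation this role works on day to day. The lake path, event schema, and column names are hypothetical assumptions, not a description of Parable's actual pipelines.

```python
import polars as pl

# Hypothetical lake layout: one day of raw usage events in Parquet.
events = pl.scan_parquet("s3://example-lake/raw/events/date=2024-01-01/*.parquet")

daily_usage = (
    events
    .filter(pl.col("duration_seconds") > 0)                  # drop zero-length events
    .with_columns(pl.col("timestamp").dt.date().alias("day"))
    .group_by("user_id", "app_name", "day")
    .agg(
        pl.col("duration_seconds").sum().alias("total_seconds"),
        pl.len().alias("event_count"),
    )
    .sort("total_seconds", descending=True)
)

# The lazy plan executes only here, so Polars can optimize the whole query.
print(daily_usage.collect())
```

Because the query is built lazily, filters and projections can be pushed down to the Parquet scan - exactly the kind of pipeline-efficiency thinking the bullets above describe.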
You will be responsible for:
  • Working closely with ML and AI Engineering teams to design, build, and maintain orchestration solutions and pipelines that enable ML/AI teams to self-serve the development and deployment of data flows at scale
  • Ensuring data integrity, quality, privacy, security, and accessibility for internal and external clients
  • Participating in the development of robust systems for data ingestion, transformation, and delivery across our platform
  • Creating efficient data workflows that balance performance, resource utilization, and ease of use for AI/ML teams
  • Implementing monitoring and observability solutions for data pipelines to ensure reliability (a minimal sketch follows this list)
  • Researching and experimenting with new data platform technologies and solutions
  • Establishing best practices for data orchestration and pipeline development
  • Collaborating with cross-functional teams to understand data requirements and deliver solutions
  • Contributing to our infrastructure-as-code practices on Google Cloud Platform
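The posting doesn't name a specific orchestration framework, so the sketch below uses plain Python to illustrate the monitoring-minded pipeline style described above: structured logs, timing, and an alert hook around each task. `send_alert` and `ingest_events` are hypothetical placeholders, not Parable code.

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("pipeline")

def send_alert(message: str) -> None:
    # Hypothetical stand-in for a real paging/Slack/Cloud Monitoring hook.
    log.error("ALERT: %s", message)

def monitored_task(func):
    """Wrap a pipeline step with timing, structured logs, and failure alerting."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        log.info("task=%s status=started", func.__name__)
        try:
            result = func(*args, **kwargs)
        except Exception as exc:
            send_alert(f"task={func.__name__} failed: {exc}")
            raise
        log.info("task=%s status=ok duration_s=%.2f",
                 func.__name__, time.monotonic() - start)
        return result
    return wrapper

@monitored_task
def ingest_events(source: str) -> int:
    # Placeholder body; a real task would pull from a SaaS API or the data lake.
    return 42

if __name__ == "__main__":
    ingest_events("example-source")
```

In a real deployment the same wrapper pattern would emit metrics to whatever observability backend the team standardizes on.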
In your first 3 months, you'll:
  • Work with our Data Platform Team + ML Team to build highly scalable data pipelines, data lakes, and orchestration services (a PySpark sketch follows this list)
  • Enable the ML and AI Engineering teams to deploy their solutions with reliable and efficient data processing workflows
  • Help lay the groundwork for a scalable and secure data practice
  • Write production-grade code in Python, Rust, and SQL
  • Contribute to our Google Cloud Platform infrastructure using Infrastructure as Code
  • Implement monitoring and alerting for critical data pipelines
  • Experiment rapidly to deliver learnings and results in the first month
  • Help foster a culture of technical and professional development
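For flavor, here is a minimal PySpark sketch of the "pipelines and data lakes" work the first bullet above describes. The GCS bucket paths and partition scheme are illustrative assumptions only.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("example-usage-rollup").getOrCreate()

# Hypothetical lake layout: raw events in, curated aggregates out.
raw = spark.read.parquet("gs://example-lake/raw/events/")

rollup = (
    raw.where(F.col("duration_seconds") > 0)
       .withColumn("day", F.to_date("timestamp"))
       .groupBy("user_id", "day")
       .agg(F.sum("duration_seconds").alias("total_seconds"))
)

# Partitioning by day keeps downstream reads cheap and incremental.
rollup.write.mode("overwrite").partitionBy("day").parquet(
    "gs://example-lake/curated/daily_usage/"
)
```

The same rollup could be expressed in Polars for smaller single-node workloads; choosing between the two is exactly the kind of performance and resource trade-off the responsibilities above call out.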
Requirements:
  • 5+ years of experience building enterprise-grade data products and systems
  • Strong expertise in data orchestration frameworks and technologies
  • Demonstrated experience with PySpark, Polars, data lakes, and distributed data processing concepts
  • Proficiency in Python and/or Rust for production pipeline code
  • Experience connecting and integrating external data sources, specifically SaaS APIs
  • Familiarity with cloud platforms, particularly Google Cloud Platform
  • Knowledge of data modeling, schema design, and data governance principles
  • Experience with containerization and infrastructure-as-code
  • Bachelor's degree in Computer Science, Machine Learning, Information Science, or related field preferred
