MLOps L2 Support Engineer
Company: Resource Informatics Group
Location: Reading, PA 19606
Description:
Role Title: MLOps L2 Support Engineer
Location City: Reading, PA (Onsite role)
Duration: 12 Months
Rates: DOE
Interview: Two Rounds - Video
Work Schedule: Rotational on-call support (including weekends and nights), shift-based monitoring for ML workflows and Dataiku jobs. Flexible work hours to address production incidents and critical failures.
Role Overview:
We are looking for an experienced MLOps L2 Support Engineer to provide 24/7 production support for Machine Learning (ML) and data pipelines. This role involves on-call rotations, including weekends, to ensure maximum uptime and reliability of ML workflows. The ideal candidate will work hands-on with Dataiku, AWS, CI/CD pipelines, and containerized deployments to maintain and troubleshoot ML models in production environments.
Required Skills & Experience:
- Experience: 5+ years in MLOps, Data Engineering, or Production Support
- Dataiku DSS: Advanced knowledge of workflows, plugins, APIs
- Cloud Platforms: Hands-on AWS experience (SageMaker, Lambda, ECS, IAM, etc.)
- CI/CD & Automation: Familiar with tools like GitHub Actions, Jenkins, Terraform
- Scripting: Proficient in Python, Bash, and SQL
- Monitoring Tools: Experience with Prometheus, Grafana, CloudWatch, ELK Stack
- Incident Management: Able to handle on-call duties and resolve incidents within SLA targets