AI Engineer
Company: Phare Health
Location: New York, NY 10025
Description:
About Us
Our mission is to make healthcare reimbursement fair and transparent, so providers can spend more time caring for patients and less time haggling over costs. We specifically focus on the most complex AI challenges that require novel R&D, building products that are fit for purpose in healthcare. We are backed by some of the top healthcare investors and growing fast. Join us!
The Role
As an AI Engineer on our team, you will architect and optimize the training and inference infrastructure that underpins our healthcare language models. You'll collaborate closely with research scientists, product teams, and end-users to ensure our AI solutions are robust, scalable, and deployable for real-world clinical applications. You will work with state-of-the-art open-source LLMs running on GPUs, helping us advance healthcare NLP in production environments.
We are looking for a candidate who can come into our NYC (Soho) office 3+ days per week.
Key Responsibilities
- LLM Training & Inference Infrastructure: Develop and maintain GPU-accelerated systems for large-scale training and inference, ensuring high throughput and low latency. Optimize distributed training pipelines, handle multi-node clusters, and evaluate state-of-the-art frameworks for open-source language models.
- Model Optimization & Deployment: Implement techniques such as model parallelism, quantization, knowledge distillation, and efficient serving to deliver cost-effective and fast inference for mission-critical healthcare applications.
- Collaboration with AI Research: Work with our NLP research team to integrate new model architectures, fine-tuned weights, and evaluation benchmarks into production pipelines. Establish best practices for version control, reproducible experiments, and continuous model improvement.
- Healthcare Data Integration: Collaborate with data engineering teams to ingest and preprocess large clinical datasets (EHR, claims data, etc.) in GPU-friendly formats. Help define secure and scalable data workflows that comply with healthcare regulations.
- Monitoring & Scalability: Set up monitoring, logging, and alerting for AI systems in production, ensuring uptime and performance metrics are met. Implement strategies for autoscaling and distributed resource management.
- Technical Leadership & R&D: Stay current with the latest research in large-scale machine learning, GPU acceleration, and MLOps. Champion best practices across the broader team, sharing insights through presentations, docs, and code reviews.
Technical Expertise
- Educational Background: MS/PhD in Computer Science, Electrical Engineering, or a related field (or equivalent industry experience).
- Hands-on Experience: 2+ years of building and optimizing ML infrastructure for large-scale training and inference. Familiarity with GPU-accelerated computing, distributed systems, and open-source LLMs.
- Deep Knowledge of ML & MLOps: Proficiency in Python and frameworks like PyTorch for large-scale model training. Experience with containerization (Docker/Kubernetes), experiment tracking, CI/CD, and monitoring for AI systems.
- Performance Tuning & Deployment: Track record of improving inference efficiency and throughput via techniques like model parallelism, quantization, or knowledge distillation.
- Startup Mindset: Comfortable with ambiguity, rapid iteration, and owning projects end-to-end. Driven to deliver meaningful outcomes and iterate quickly on user feedback.
Benefits
- Competitive Compensation: Top-of-market salary plus equity.
- Flexible PTO: Generous vacation policy and a culture that supports work-life balance.
- Team Culture: Collaborative environment with regular team-building events. Mission-driven work that makes a tangible impact in healthcare.
Interview Process
1. Initial Application: Submit your resume/LinkedIn and a brief statement about why you're interested.
2. Intro Call: Discuss your background, career goals, and our mission to see if there's a mutual fit.
3. Technical Interviews (2x): Includes a programming or system design exercise focused on large-scale training/inference and GPU workflows.
4. Referees: Provide 2 references who can speak to your professional/technical accomplishments.
5. Culture Interview: Explore ways of working, team fit, and give you a chance to ask questions.
6. Offer: We'll extend a competitive offer for the right candidate to join our growing team.