On-device ML Infrastructure Engineer (ML Modeling Semantics & Representation)
Company: Apple
Location: Cupertino, CA 95014
Description:
Summary
The On-Device Machine Learning team at Apple is responsible for enabling the research-to-production lifecycle of cutting-edge machine learning models that power magical user experiences on Apple's hardware and software platforms. The team sits at the heart of that discipline, interfacing with research, software engineering, hardware engineering, and product teams.
Our group is looking for an ML Infrastructure Engineer with a focus on ML model semantics, representation, and optimization. You will work with ML researchers and applied research engineers to onboard the newest ML architectures onto CoreML's model representation, evolving that representation to support the latest features of the authoring framework (e.g., PyTorch) and to expose Apple's on-device execution capabilities. You will also build the critical "bridging" infrastructure between the most widely used ML frameworks (e.g., PyTorch) and Apple's CoreML stack.
Key responsibilities:
- Develop technologies to quickly onboard new ML models to our on-device stack, including contributions to ML authoring frameworks.
- Understand the operations, architectures, and graph representations used by different authoring frameworks, and keep abreast of the latest innovations in this space.
- Architect and build CoreML's model representation so that it efficiently captures program semantics from the authoring frameworks while allowing for peak execution performance.
- Define and build the user-facing model translation and ingestion abstractions, APIs, and surrounding toolkit that allow seamless model import into Apple's ML stack (a minimal conversion sketch follows this list).
- Perform optimizations such as quantization and operator transformations to make models more amenable to efficient on-device deployment.
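To make the translation and ingestion path above concrete, here is a minimal, hedged sketch of bridging a PyTorch model into CoreML using Apple's open-source coremltools converter. The TinyNet module, input shape, and output file name are hypothetical stand-ins, and the sketch assumes a recent coremltools release where ct.convert accepts a TorchScript module and can emit an ML Program.

```python
# Minimal sketch of the PyTorch -> CoreML bridging path.
# TinyNet, the input shape, and the file name are hypothetical placeholders.
import torch
import coremltools as ct


class TinyNet(torch.nn.Module):
    """Stand-in for a real authored model."""

    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.pool = torch.nn.AdaptiveAvgPool2d(1)
        self.fc = torch.nn.Linear(8, 10)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv(x))).flatten(1)
        return self.fc(x)


model = TinyNet().eval()
example = torch.rand(1, 3, 224, 224)

# Capture the authored program as TorchScript, then translate it into
# CoreML's ML Program representation.
traced = torch.jit.trace(model, example)
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="x", shape=(1, 3, 224, 224))],
    convert_to="mlprogram",
)
mlmodel.save("TinyNet.mlpackage")
```

ct.convert is also where gaps between the authored program and the on-device representation surface, which is the kind of bridging work this role owns.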
Description
As an engineer in this role, you will be focused primarily on the interplay between higher-level ML authoring frameworks (such as PyTorch, JAX, and MLX) and Apple's on-device ML infrastructure. The role requires an understanding of ML modeling (architectures, training vs. inference trade-offs, etc.) and of ML deployment optimizations (compression, distillation, quantization, hardware optimizations, etc.).
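As one concrete, hedged example of such a deployment optimization, the sketch below applies post-training 8-bit linear weight quantization to an already converted model. It assumes coremltools >= 7.0, where the ct.optimize.coreml module exposes linear_quantize_weights, and it reuses the hypothetical TinyNet.mlpackage written by the conversion sketch above.

```python
# Hedged sketch: post-training linear weight quantization of a converted model.
# Assumes coremltools >= 7.0 and the hypothetical TinyNet.mlpackage from the
# conversion sketch earlier in this posting.
import coremltools as ct
import coremltools.optimize.coreml as cto

mlmodel = ct.models.MLModel("TinyNet.mlpackage")

config = cto.OptimizationConfig(
    global_config=cto.OpLinearQuantizerConfig(mode="linear_symmetric")
)
quantized = cto.linear_quantize_weights(mlmodel, config=config)
quantized.save("TinyNet_w8.mlpackage")
```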
We are building the first end-to-end developer experience for ML development that, by taking advantage of Apple's vertical integration, lets developers iterate on model authoring, optimization, transformation, execution, debugging, profiling, and analysis. ML representation, translation, and optimization are the entry point to this infrastructure stack.
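One way that iteration loop can start is a simple numerical-parity check between the authoring framework and the converted model. The sketch below is a toy illustration: the two-layer model is hypothetical, and MLModel.predict requires running on macOS.

```python
# Hedged sketch: checking that the CoreML translation matches the authored
# PyTorch model numerically. The toy model is a hypothetical placeholder, and
# MLModel.predict only executes on macOS.
import numpy as np
import torch
import coremltools as ct

model = torch.nn.Sequential(torch.nn.Linear(16, 4), torch.nn.ReLU()).eval()
x = torch.rand(1, 16)

traced = torch.jit.trace(model, x)
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="x", shape=(1, 16))],
    convert_to="mlprogram",
    compute_units=ct.ComputeUnit.CPU_ONLY,  # keep the comparison deterministic
)

with torch.no_grad():
    ref = model(x).numpy()               # reference output from the authoring framework

out = mlmodel.predict({"x": x.numpy()})  # CoreML execution
coreml_out = next(iter(out.values()))    # single-output model assumed

# Loose tolerance: ML Program execution may run parts of the graph in fp16.
np.testing.assert_allclose(ref, coreml_out, rtol=1e-2, atol=1e-2)
```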