AI/ML Site Reliability Engineer (SRE)
Company: Lockheed Martin
Location: King of Prussia, PA 19406
Description:
Space is a critical domain, connecting our technologies, our security and our humanity. While others view space as a destination, we see it as a realm of possibilities, where we can do more - we can innovate, invest, inspire and integrate our capabilities to transform the future.
At Lockheed Martin Space, we aim to harness the full potential of space to cultivate innovation, reduce costs, and push the boundaries of what technology can achieve. We're creating future-ready solutions, focusing on resiliency and urgency through our 21st Century Security vision. We're erasing boundaries and forming partnerships across industries and around the world. We're advancing spacecraft and the workforce to fuel the next generation. And we're reimagining how space can connect us, ensuring security and prosperity.
Join us in shaping a new era in space and find a career that's built for you.
Job Description
We are seeking an experienced Site Reliability Engineer (SRE) to join our team, responsible for designing, building, and maintaining the infrastructure for a new Artificial Intelligence (AI) and Machine Learning (ML) environment. As an SRE, you will focus on provisioning, deploying, and managing the underlying infrastructure and tools that support the development, testing, and deployment of AI/ML models. You will work closely with data science, engineering, and product teams to ensure that the AI/ML systems are scalable, reliable, and performant. You will also help upskill existing staff in best practices for software maintenance and management, as well as server and workstation provisioning.
Responsibilities:
+ Design, build, and maintain the underlying infrastructure for AI/ML systems, including compute resources, storage, networking, and security
+ Provision and manage AI/ML tools and frameworks, such as TensorFlow, PyTorch, and scikit-learn, to support the development, testing, and deployment of AI/ML models
+ Deploy and manage the development, testing, and production environments used to build, test, and deploy AI/ML models
+ Ensure that AI/ML systems are scalable and performant by designing and implementing efficient architectures and optimizing resource utilization
+ Collaborate with data science, engineering, and product teams to ensure that AI/ML systems meet business requirements and are properly integrated with other systems
+ Troubleshoot and resolve issues with AI/ML infrastructure and tools to ensure that systems run smoothly and efficiently
+ Stay current with the latest developments in the DevSecOps/SRE community of practice, and apply this knowledge to continuously improve our infrastructure and tools
This position is contingent upon the program award, expected in spring 2025.
Basic Qualifications
+ Experience with HPC hardware such as GPU-based systems (e.g., NVIDIA Tesla, Quadro), high-performance CPUs (e.g., Intel Xeon, AMD EPYC), and high-speed storage systems
+ Experience with AI/ML-specific hardware
+ Experience with network storage fundamentals (e.g., block storage, object storage, file systems)
+ Programming skills in languages such as Python, Java, or C++
+ Experience with AI/ML
+ Experience with containerization and orchestration technologies such as Docker or Kubernetes
Must have an active TS/SCI security clearance to start.
Desired skills
+ Experience with TensorFlow, PyTorch, or scikit-learn
+ Experience with NVIDIA DGX and H-series appliances
+ Experience with monitoring and logging tools such as Splunk
+ Experience with Agile project management methodologies (e.g., Scrum, Kanban)
+ Experience with CI/CD tools such as Jenkins, GitLab, or Rancher
EEO
Lockheed Martin is an equal opportunity employer. Qualified candidates will be considered without regard to legally protected characteristics.
The application window will close in 90 days; applicants are encouraged to apply within 5 - 30 days of the requisition posting date in order to receive optimal consideration.
At Lockheed Martin, we use our passion for purposeful innovation to help keep people safe and solve the world's most complex challenges. Our people are some of the greatest minds in the industry and truly make Lockheed Martin a great place to work.
With our employees as our priority, we provide diverse career opportunities designed to propel, develop, and boost agility. Our flexible schedules, competitive pay, and comprehensive benefits enable our employees to live a healthy, fulfilling life at and outside of work. We place an emphasis on empowering our employees by fostering an inclusive environment built upon integrity and corporate responsibility.
If this sounds like a culture you connect with, you're invited to apply for this role. Or, if you are unsure whether your experience aligns with the requirements of this position, we encourage you to search on Lockheed Martin Jobs, and apply for roles that align with your qualifications.
Other Important Information
By applying to this job, you are expressing interest in this position and could be considered for other career opportunities where similar skills and requirements have been identified as a match. Should such a match be identified, you may be contacted for this and future openings.
Ability to work remotely
Part-time Remote Telework: The employee selected for this position will work part of their work schedule remotely and part of their work schedule at a designated Lockheed Martin facility. The specific weekly schedule will be discussed during the hiring process.
Work Schedule Information
Lockheed Martin supports a variety of alternate work schedules that provide additional flexibility to our employees. Schedules range from a standard 40 hours over a five-day work week to condensed schedules. These condensed schedules provide employees with additional time away from the office and are in addition to our Paid Time Off benefits.
National Pay Statement
Pay Rate: The annual base salary range for this position in California and New York (excluding most major metropolitan areas), Colorado, Hawaii, Illinois, Maryland, Minnesota, Washington, or Washington, DC is $113,900 - $200,905. For states not referenced above, the salary range for this position will reflect the candidate's final work location. Please note that the salary information is a general guideline only. Lockheed Martin considers factors such as (but not limited to) scope and responsibilities of the position, candidate's work experience, education/training, and key skills, as well as market and business considerations, when extending an offer.
Benefits offered: Medical, Dental, Vision, Life Insurance, Short-Term Disability, Long-Term Disability, 401(k) match, Flexible Spending Accounts, EAP, Education Assistance, Parental Leave, Paid time off, and Holidays.
(Washington state applicants only) Non-represented full-time employees accrue at least 10 hours per month of Paid Time Off (PTO) to be used for incidental absences and other reasons, and receive at least 90 hours for holidays. Represented full-time employees accrue 6.67 hours of Vacation per month, accrue up to 52 hours of sick leave annually, and receive at least 96 hours for holidays. PTO, Vacation, sick leave, and holiday hours are prorated based on start date during the calendar year.
This position is incentive plan eligible.
Premium Pay Statement
Pay Rate: The annual base salary range for this position in most major metropolitan areas in California and New York is $131,000 - $227,125. For states not referenced above, the salary range for this position will reflect the candidate's final work location. Please note that the salary information is a general guideline only. Lockheed Martin considers factors such as (but not limited to) scope and responsibilities of the position, candidate's work experience, education/training, and key skills, as well as market and business considerations, when extending an offer.
Benefits offered: Medical, Dental, Vision, Life Insurance, Short-Term Disability, Long-Term Disability, 401(k) match, Flexible Spending Accounts, EAP, Education Assistance, Parental Leave, Paid time off, and Holidays.
This position is incentive plan eligible.