Platform / Site Reliability Engineer

Company: Axiom Software Solutions Limited

Location: Seattle, WA 98115

Description:

We are looking for a skilled Platform Engineer / SRE to design, implement, and maintain our cloud infrastructure and platforms. The ideal candidate will have a strong background in Kubernetes administration, Azure cloud services, infrastructure as code, and automation. You will play a crucial role in ensuring the scalability, reliability, and security of our systems while supporting our AI/ML initiatives.

Responsibilities:

* Design, deploy, and manage infrastructure solutions using Terraform, ensuring scalability, security, and reliability.

* Develop and maintain infrastructure as code scripts to automate the provisioning and configuration of resources.

* Ensure version-controlled, repeatable deployments using IaC best practices.

* Implement and manage Kubernetes clusters for containerized applications.

* Collaborate with development teams to deploy, scale, and optimize applications in Kubernetes environments.

* Use scripting languages (e.g., Python) to automate routine tasks and streamline workflows.

* Implement continuous integration and continuous deployment (CI/CD) pipelines for efficient software delivery.

* Ensure seamless integration of infrastructure components with CI/CD pipelines.

* Design, deploy, and maintain scalable and reliable infrastructure for AI/ML platforms.

* Implement containerization (Docker) and orchestration (Kubernetes) solutions for deploying and managing AI/ML applications.

* Ensure containerized applications are secure, scalable, and easily deployable.

* Enable seamless integration of AI/ML models into the platform, ensuring data pipelines are efficient and reliable.

* Establish monitoring and alerting systems to ensure the health and performance of AI/ML platforms.

* Implement security best practices for AI/ML platforms, ensuring data privacy and compliance with industry standards.

Requirements:

* Bachelor's degree in Computer Science, Engineering, or a related field

* Proven experience in Kubernetes administration, specifically with Azure Kubernetes Service (AKS)

* Strong proficiency in Azure cloud services and Azure Resource Manager (ARM) templates

* Expert-level scripting skills in PowerShell and Python

* Hands-on experience with Terraform for infrastructure as code

* Solid understanding of CI/CD principles and experience with Azure DevOps

* Experience with containerization technologies, particularly Docker

* Strong problem-solving skills and ability to work in a fast-paced environment

* Excellent communication and collaboration skills
