SRE Product Manager


Company: Locate Software, Inc.

Location: Bellevue, WA 98006

Description:

Job Title: SRE Product Manager

Location: Bellevue, Washington / Remote
Employment Type: Full-time

Job Summary:
We are seeking a highly motivated and experienced Site Reliability Engineering (SRE) Product Manager to join our team. In this role, you will bridge product management and SRE teams, ensuring the reliability, scalability, and performance of our critical systems while aligning with business objectives. You will define product strategies and roadmaps that enhance system reliability, and collaborate with cross-functional teams to deliver high-quality solutions.

Key Responsibilities:

  • Product Ownership:
    • Define and maintain the roadmap for SRE-related products, tools, and services, ensuring alignment with business objectives and customer needs.
    • Collaborate with engineering and SRE teams to prioritize and deliver reliability-focused initiatives.
  • Reliability and Scalability:
    • Partner with SRE teams to ensure production systems are reliable, scalable, and performant.
    • Define key reliability measures, including service level indicators (SLIs), objectives (SLOs), and agreements (SLAs), and ensure these targets are met through proactive planning.
  • Stakeholder Collaboration:
    • Work with stakeholders, including engineering, operations, and product teams, to align on priorities and objectives.
    • Act as the main point of contact for all reliability-related discussions with leadership and clients.
  • Incident Management and Improvement:
    • Drive post-incident reviews and ensure appropriate follow-ups to improve system reliability and prevent future occurrences.
    • Continuously identify areas for improvement in incident response, monitoring, and automation.
  • Tooling and Automation:
    • Oversee the development and enhancement of SRE tools and platforms to improve efficiency, observability, and system performance.
    • Ensure automation is integrated into reliability processes to minimize manual intervention and reduce downtime.
  • Documentation and Communication:
    • Document processes, tools, and strategies to ensure clarity and alignment across teams.
    • Communicate effectively with technical and non-technical stakeholders about SRE initiatives, updates, and priorities.


Qualifications:

  • Education: Bachelor's degree in Computer Science, Engineering, or a related field. Advanced degree preferred.
  • Experience:
    • Proven experience as a Product Manager, preferably in an SRE or DevOps environment.
    • Strong understanding of Site Reliability Engineering principles, including monitoring, automation, and incident response.
    • Experience working with cloud platforms such as AWS, Azure, or Google Cloud.
  • Skills:
    • Proficiency in defining SLAs, SLOs, and SLIs.
    • Strong project management skills with the ability to prioritize and manage multiple initiatives.
    • Familiarity with modern observability tools (e.g., Prometheus, Grafana, Splunk).
