Site Reliability Developer

Company: Tata Consultancy Services

Location: Austin, TX 78745

Description:

Site Reliability Engineer (SRE), Apple Pay

Seeking a Site Reliability Engineer (SRE) to join the Apple Pay team. The ideal candidate will excel in monitoring, debugging, and production support to ensure high availability and reliability of critical payment services.

Key Responsibilities:
  • Proactively monitor and troubleshoot production systems to minimize downtime and ensure optimal performance.
  • Provide L2 support for production incidents, including effective collaboration with internal teams and external partners.
  • Work with cloud technologies (AWS or similar) and leverage Kubernetes for container orchestration and scaling.
  • Conduct root cause analysis (RCA) for incidents and implement preventative measures.
  • Optimize system performance through automation, monitoring improvements, and best practices.

Qualifications:
  • Hands-on experience in production support and incident management.
  • Strong expertise in monitoring tools (e.g., Prometheus, Grafana, Datadog).
  • Solid debugging and troubleshooting skills in large-scale, distributed systems.
  • Familiarity with cloud platforms and Kubernetes is highly desirable.
  • Excellent communication skills for effective collaboration with partners.

Preferred Skills:
  • Experience with SRE principles like SLIs, SLOs, and error budgets.

Salary Range: $67,100 - $135,000 a year
