SRE Engineer

Company: Tata Consultancy Services

Location: Miami, FL 33186

Description:

Objectives of this role

Run the production environment by monitoring availability and taking a holistic view of system health

Build software and systems to manage platform infrastructure and applications

Improve reliability, quality, and time-to-market of our suite of software solutions

Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating for continual improvement

Provide primary operational support and engineering for multiple large-scale distributed software applications

Responsibilities

Gather and analyze metrics from operating systems as well as applications to assist in performance tuning and fault finding

Partner with development teams to improve services through rigorous testing and release procedures

Participate in system design consulting, platform management, and capacity planning

Create sustainable systems and services through automation and uplifts

Balance feature development speed and reliability with well-defined service-level objectives

Role

SRE (Site Reliability Engineer) / Lead with strong full-stack technical experience and hands-on knowledge of AppDynamics, Splunk, New Relic, Datadog, CloudWatch, and Akamai.

Oversee the SRE team, ensuring high availability and reliability of services.

Manage incidents and drive post-mortem analyses to prevent recurrence.

Liaise with management to provide updates on service reliability metrics and team performance.

Implement monitoring, alerting, and incident response strategies.

Conduct on-call duties and participate in incident response.

Contribute to post-mortem analysis and service improvement efforts.

Provide technical support and troubleshooting assistance.

Respond to support tickets, diagnose issues, and offer solutions.

Assist in maintaining documentation of known issues and resolutions.

Salary Range: $100,000-$130,000 a year

