Engineer

Company: Tata Consultancy Services

Location: Jersey City, NJ 07305

Description:

Skill: Site Reliability Engineering - Senior Engineer

Must have skills:

  • Python or Java.
  • Splunk Cloud, ThousandEyes, and cloud platforms such as AWS, Google Cloud, or Azure.
  • Docker and Kubernetes.


Responsibilities:

  • System Reliability: Work with production support teams to implement scalable, maintainable systems, continuously seeking improvements and optimizations in infrastructure and application architecture.
  • Toil Reduction and Automation: Build and maintain tools and scripts that automate repetitive tasks, deployment processes, monitoring, and incident response, reducing manual intervention and minimizing human error.
  • Incident Management: Participate in on-call rotations and major-incident response, promptly investigate and resolve system issues and service outages, and conduct post-mortems that feed into problem management to prevent recurrence.
  • Monitoring and Alerting: Establish and maintain monitoring and alerting systems to proactively identify potential issues, ensuring timely notifications to relevant teams during critical situations.
  • Capacity Planning and Performance Optimization: Monitor system performance, identify bottlenecks, collaborate with engineering teams for performance optimization, and plan for future growth.
  • Error Budgeting and Chaos Engineering: Diagnose and recommend optimization opportunities, and conduct mock drills to improve stability and resiliency.
  • Documentation: Develop and maintain comprehensive documentation for system configurations, processes, and troubleshooting procedures to enhance knowledge sharing and team efficiency.


Minimum Qualifications:

  • Knowledgeable in cloud platforms such as AWS, Google Cloud, or Azure, and familiar with containerization technologies like Docker and Kubernetes.
  • Proficient in using infrastructure-as-code tools like Terraform and Ansible for automation and configuration management.


Preferred Qualifications:

  • Experienced in software development with proficiency in programming languages like Python or Java.
  • Familiar with monitoring and logging tools such as Splunk Cloud and ThousandEyes.
  • Understands networking principles and protocols.
  • Capable of working collaboratively in a fast-paced, dynamic environment with excellent problem-solving skills.


Salary Range: $105,000-$125,000 a year
