Site Reliability Engineer Lead
Company: InfiCare Software Technologies
Location: Atlanta, GA 30349
Description:
This position is for a hands-on SRE lead focused on providing resilient, secure, scalable, and supportable services for a heterogeneous set of applications, across on-premise and cloud environments, that cater to store systems. You will contribute to the team's strategy and delivery, as well as manage the day-to-day workload. This role requires building close relationships with our development teams and our operations, engineering, database, and product organizations.
You will be involved in designing resilient systems, defining and monitoring SLIs, SLOs, and BLAs, creating proactive, actionable alerts, and driving production incidents to resolution. We operate in a hybrid cloud environment.
Responsibilities
- Provide thought leadership and set the technical direction for the SRE and overall development team.
- Define and manage projects to meet team objectives.
- Set individual goals and manage personal growth of team members.
- Oversee and guide development to implement SRE strategies for a diverse set of SaaS applications and internal services.
- Serve as the face of a team responsible for the overall health, performance, and capacity of our business applications.
- Develop sustainable SRE practices around simplification and standardization.
- Drive the cultural standard for SRE, including defining ways of working, runbooks, and accountability across people, processes, and technology.
- Partner with other SRE and development teams and lead by example.
Knowledge And Experience
- 1X+ years of Application/Systems engineering in XXxX Production Services environments
- BS in Computer Science, Computer Engineering, Math, or equivalent professional experience
- Experience in designing, deploying, and operating SaaS applications and cloud infrastructure (GCP or equivalent, and on-premise virtualized environments)
- Excellent troubleshooter spanning systems, networks, and code, utilizing a systematic problem-solving approach
- Demonstrated ability to lead diverse SRE and development teams
- Fluency with one or more current-generation scripting languages used by SRE/DevOps professionals
- Proficiency in monitoring and performance tools such as Dynatrace, Splunk, Google Analytics, and ELK
- Experience implementing SRE strategies on Google Cloud
- Strong communication skills