Cloud SRE and Tools Engineer

Company: Charles Schwab

Location: Austin, TX 78701

Description:

Your Opportunity

At Schwab, you're empowered to make an impact on your career. Here, innovative thought meets creative problem solving, helping us "challenge the status quo" and transform the finance industry together.

Charles Schwab's Technology Services Technical Managers thrive in a leading-edge work culture while focusing on the products and solutions that help Schwab customers learn, explore, and make life-impacting moves on their paths to achieving their goals. This position requires a highly motivated individual with strong problem-solving skills who can contribute to a highly collaborative culture and a team environment to deliver innovative, value-based, reliable solutions. A proven track record of delivering high-quality technology products and services in a hyper-growth environment where priorities shift quickly is key to success in the role.

This Cloud Site Reliability Engineering role is a hands-on technical role and will be responsible for executing the team's activities on tools engineering, high availability, maintainability, support, and automation of complex application platforms and microservices.

Lead the execution of various SRE tasks for the Cross Enterprise organizations.
Communicate technology plans to associated business partners ensuring collaboration across technology and the business.
Regularly interact with technology leaders, product owners and business partners.
Partner with the team on complex issues where analysis of situations or data requires in-depth knowledge of the applications and environment.
Partner with highly experienced technologists, contributing to and creating successful plans to deliver solutions to the business and to ensure day-to-day support, high availability, and process compliance.
Support Operational delivery and Production for Platform SRE, focusing on proactive monitoring and rapid response.
Perform proactive daily system monitoring including reviewing system and application logs as well as responding to, triaging, troubleshooting and remediating incidents.
Repair and recover from failures. Coordinate and communicate with impacted stakeholders and clients, escalating where appropriate.
Monitor and troubleshoot issues across the entire stack - software, application, and network.
Develop automation and processes to enable teams to deploy, manage, configure, scale, and monitor their applications.
Help identify application reliability and availability improvements, and establish and build solutions that continue to drive an improved experience.
Develop and manage continuous deployment and integration solutions.
Create and review documentation and processes covering recurring issues, new standard operating procedures, knowledge transfer material, etc.
Collaborate with Engineering, Scrum and Ops resources to provide technical expertise and support on key initiatives for system availability and reliability.

What you have

4+ years of experience in Linux/Java Software Development & Architecture, Operations, DevOps, etc.
Demonstrated ability to resolve business- and service-impacting problems by evaluating all alternatives and consulting with other technical members of the organization is required.
Familiarity with database management systems (Oracle, SQL)
Knowledge of Platform as a Service (PaaS) and Infrastructure as a Service (IaaS)
Experience with Continuous Integration/Continuous Delivery (Bamboo, Go or other related tools)
Experience with Git, JIRA and related Atlassian stack
Experience with environment provisioning and deployment automation (Salt/Chef/Puppet)
Ability to work with global teams.
Flexibility to operate in an environment with changing demands and priorities.
Experience with Terraform is preferred.
Experience working in Cloud Environments (AWS, GCP, or Azure) highly desired.
