DevOps/Site Reliability Engineer

Company: Seven Seven Software

Location: Alpharetta, GA 30022

Description:

Responsibilities:

* Maintain applications once they are live by measuring and monitoring availability, latency, and overall system health with a focus on business activities, and continuously evaluate cost and waste

* Engage in and improve the whole lifecycle of services from inception and design, through deployment, operation, capacity planning and launch reviews.

* Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity; this includes automation for various other operational needs.

* Troubleshoot infrastructure issues, review log files, update documentation, and maintain a knowledge base of resolutions

* Work closely with the Application Development team to understand the platform and create tools/utilities to help with production management

* Work with upstream data providers and downstream consumers, and reduce the number of escalations to development teams

* Develop scripts and assist with code changes along with operational tasks/activities.

* Work closely with Application Development to ensure that the support team has excellent knowledge of the application set; own and maintain the support knowledge base and documents

* Use analytical skills to find trends in the environment and drive out problems.

* Lead efforts to identify improvement areas to stabilize the plant.

* Identify risks, be responsive, and work with a sense of urgency, whether within a team or independently

* Test and tune network, hardware, and software configurations to maximize performance

* Interface with different teams, such as IT Dev managers and Infrastructure teams, and lead as a Subject Matter Expert (SME) for the supported application(s).

* Understand the overall business flow of supported application systems and their interfaces with clients

* Take ownership of and manage production requests, questions, and issues, and perform Root Cause Analysis for outages/incidents

* Be flexible to participate in the weekend on-call rotation and be available for offshore time lead

* Within the Application Support space, be accountable for the Production and non-Production Environments for the existing GBT team, and be part of 24/7 production support coverage.

Skills Required:

* 5+ years of experience in a production environment with a solid software development background and understanding of performance tuning, end-to-end troubleshooting, networking fundamentals and appropriate attention to detail.

* Ability to focus and provide resolutions for production issues in a highly demanding, high-pressure environment

* Requires experience in designing, developing, and implementing technical solutions, or significant experience in deep technical support

* Strong experience in scripting languages (Shell, Python, Perl, etc.) and cloud-driven development

* Strong database skills with DB2, Sybase or Oracle

* Hands-on experience with Autosys or other batch scheduling software

* Strong experience in Continuous Integration and Continuous Deployment

* Strong experience with on-demand environments for both virtual machines and containers

* Knowledge of and hands-on experience with monitoring tools like Splunk, IP Soft, and Sockeye

* Practical experience with Agile methodology (e.g., Scrum)

* Knowledge or experience with automating deployments using Jenkins, Train or Windeploy

* Ability to diagnose technical problems, debug, optimize code, and automate routine tasks

* Hands-on experience in application and database troubleshooting/issue resolution in a fast-paced environment

* Excellent communication skills and the ability to think outside the box for process improvements.

* Bachelor's/Master's Degree in Computer Science, Information Systems or related field

Skills Desired:

* Knowledge of cloud-based deployment, security, and networking concepts in Azure and AWS

* Knowledge or experience with algorithms, data structures, complexity analysis and software design

* Interest in designing, analyzing, and troubleshooting large-scale distributed systems.
