DevOps/Site Reliability Engineer
Company: Seven Seven Software
Location: Alpharetta, GA 30022
Description:
Responsibilities:
* Maintain live applications by measuring and monitoring availability, latency, and overall system health with a focus on business activities, and continuously evaluate cost and waste
* Engage in and improve the whole lifecycle of services from inception and design, through deployment, operation, capacity planning and launch reviews.
* Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity; this includes automation for various other operational needs.
* Troubleshoot infrastructure issues, review log files, update documentation, and maintain a knowledge base of resolutions
* Work closely with the application Development team to understand the platform and create tools/utilities to help with production management
* Work with upstream data providers and downstream consumers, and reduce the number of escalations to development teams
* Develop scripts and assist with code changes along with operational tasks/activities.
* Work closely with Application Development to ensure that the support team has excellent knowledge of the application set; own and maintain the support knowledge base and documents
* Use analytical skills to find trends in the environment and drive out problems.
* Lead efforts to identify improvement areas that stabilize the plant.
* Identify risks, be responsive, and work with a sense of urgency, whether within a team or independently
* Test and tune network, hardware, and software configurations to maximize performance
* Interface with different teams, such as IT Dev managers and Infrastructure teams, and lead as a Subject Matter Expert (SME) for the application(s) supported.
* Understand the overall business flow of supported application systems and their interfaces with clients
* Take ownership of and manage production requests, questions, and issues, and perform Root Cause Analysis for outages/incidents
* Be flexible to cover the weekend on-call rotation and be available as lead during offshore hours
* Within the Application Support space, be accountable for both the Production and non-Production Environments for the existing GBT team, and be part of 24/7 production support coverage.
Skills Required:
* 5+ years of experience in a production environment with a solid software development background and understanding of performance tuning, end-to-end troubleshooting, networking fundamentals and appropriate attention to detail.
* Ability to focus and resolve production issues in a highly demanding, high-pressure environment
* Experience in designing, developing, and implementing technical solutions, or significant experience in deep technical support
* Strong experience in scripting languages (Shell scripting, Python, Perl, etc.) and cloud-driven development
* Strong database skills with DB2, Sybase or Oracle
* Hands-on experience with Autosys or other batch scheduling software
* Strong experience in Continuous Integration and Continuous Deployment
* Strong experience with on-demand environments for both virtual machines and containers
* Knowledge of and hands-on experience with monitoring tools like Splunk, IP Soft, Sockeye
* Practical experience with Agile methodology (e.g. Scrum)
* Knowledge or experience with automating deployments using Jenkins, Train or Windeploy
* Ability to diagnose technical problems, debug, optimize code, and automate routine tasks
* Hands-on experience in application and database troubleshooting/issue resolution in a fast-paced environment
* Excellent communication skills and ability to think outside the box for process improvements.
* Bachelor's/Master's Degree in Computer Science, Information Systems or related field
Skills Desired:
* Knowledge of cloud-based deployment, security, and networking concepts in Azure and AWS
* Knowledge or experience with algorithms, data structures, complexity analysis and software design
* Interest in designing, analyzing and troubleshooting large-scale distributed systems.