Site Reliability Engineer
Company: V2 Innovations
Location: Alpharetta, GA 30022
Description:
As a Site Reliability Engineer, you will play a pivotal role in maintaining the operational integrity and reliability of our systems. Your responsibilities include running the production environment, building software and systems for platform infrastructure, and optimizing system performance across our suite of software solutions. Your expertise will improve the reliability, quality, and time-to-market of our offerings, and with them customer satisfaction.
Key Objectives:
- Monitor system availability and overall health, taking a holistic approach.
- Develop software and systems to manage platform infrastructure and applications.
- Enhance system reliability, quality, and time-to-market for software solutions.
- Optimize system performance, anticipating customer needs and fostering continuous improvement.
Responsibilities:
- Build observability and operations dashboards from integrated monitoring and logging data.
- Analyze metrics from operating systems and applications to fine-tune performance and troubleshoot issues.
- Demonstrate expertise in Dynatrace and its APIs for monitoring, analyzing, and resolving application problems.
- Configure Dynatrace for anomaly thresholds, alerting, and integration with notification channels.
- Utilize Dynatrace extensions for customized monitoring solutions.
- Collaborate with development teams to improve services through rigorous testing and release processes.
- Engage in system design consulting, platform management, and capacity planning.
- Automate systems and services to ensure sustainability and reliability.
- Balance feature development speed against reliability using defined service-level objectives (SLOs).
Required Skills and Qualifications:
- Bachelor's degree (or equivalent) in computer science or a related discipline.
- Proficiency in programming using high-level languages such as Python, Java, C/C++, Ruby, or JavaScript.
- Experience with distributed storage technologies like NFS, HDFS, Ceph, Azure Blob, or Amazon S3.
- Familiarity with dynamic resource management frameworks such as Apache Mesos, Kubernetes, or YARN.
- Proactive approach to identifying problems, performance bottlenecks, and areas for enhancement.
Preferred Skills and Qualifications:
- Demonstrated success in technical engineering roles.
- Proficiency in coding beyond basic scripts, showcasing your technical depth and capabilities.
If you are a proactive and skilled Site Reliability Engineer with a passion for enhancing system reliability and performance, we encourage you to apply. This role offers the opportunity to make a significant impact on our technology stack and services.