Site Reliability Engineer, Edge Services - USDS
Company: TikTok
Location: Seattle, WA 98115
Description:
Responsibilities
Team Insight:
CDN Site Reliability Engineering combines software and network engineering with system operations to build and run large-scale, massively distributed infrastructure. Our Edge SREs ensure infrastructure services are reliable, fault-tolerant, efficiently scalable and cost-effective. We dive deep into the stack, including network, OS, and applications, to quickly resolve complex functional and performance issues.
In order to enhance collaboration and cross-functional partnerships, among other things, at this time, our organization follows a hybrid work schedule that requires employees to work in the office 3 days a week, or as directed by their manager/department. We regularly review our hybrid work model, and the specific requirements may change at any time.
Responsibilities:
- Architect and implement solutions that enable both internal and external customers to harness the power of TikTok's content delivery network.
- Contribute to data pipelines, tools, automations, visualizations and monitors to facilitate the operation and optimization of edge services.
- Perform data monitoring and alerting, data quality assurance and anomaly detection.
- Document team processes and policies, including methods of engagement and SLOs.
- Analyze, design and implement solutions at the system level to remove bottlenecks and improve edge service performance.
- Implement monitoring and alerting to improve issue detection and response.
- Work in a fast-paced environment. Participate in technical operations and rotations in response to performance and reliability issues.
Qualifications
Minimum Qualifications:
- Bachelor's degree with 2+ years of experience in Computer Engineering, Computer Science, or related fields, or equivalent experience.
- 2+ years working experience in the field of CDN performance and traffic engineering, network solution architecting or network-focused site reliability engineering roles.
- Experience in networking technologies such as TCP/IP, BGP, DNS, etc. in a carrier-grade environment. Past experience with CDN technologies.
- 2+ years of experience in one or more programming languages such as Java, C++, Go, or scripting experience in Shell and Python.
- Strong analytical skills and the ability to solve real-world problems in a fast-moving environment.
Preferred Qualifications:
- Experience in operating in a multi-CDN environment.
- Understanding of IPv6 and IPv4-IPv6 coexistence technologies.
- Self-driven and capable of working with ambiguity and moving projects from concept to delivery.
- Experience in designing, analyzing and building automation and tools for large-scale systems.
Candidates for this position must be legally authorized to work in the United States. This position is not eligible for visa sponsorship or support.