Site Reliability Engineer

Company: Cynet Systems

Location: Phoenix, AZ 85032

Job Description:

Required Skills:
  • 3-5 years of service reliability/operations experience running large-scale, high-performance applications in a hybrid environment (on-prem and cloud).
  • 3-5 years of experience writing automation scripts and building dashboards for application performance management to manage transaction journeys.
  • 2-4 years of experience working with programming languages such as Go, Python, Java, Rust, etc.
  • Working knowledge of one or more databases: Oracle, SQL Server, Redis, Clickhouse, PostgreSQL, MongoDB, or any time-series databases.
  • At least 2 years of experience transitioning platforms to the cloud and containerization - GCP, AWS, and Rancher (or CloudFormation, Azure, and OpenShift).
  • Experience maintaining containerized applications in GKE/RKE/AKS environments.
  • Experience implementing cloud observability using OTEL to enable real-time monitoring, distributed tracing, and incident resolution.
  • Experience working with specific GraphQL frameworks (Apollo, Prisma, Hasura, etc.).
  • Experience applying knowledge of networking protocols and concepts such as TCP/IP, HTTP, DNS, load balancing, and service mesh to troubleshoot issues in high-pressure situations.
Preferred Skills:
  • Proven experience managing application availability, building creative solutions to manage repetitive activities, and improving gating.
  • Working knowledge of monitoring tools - Client, AppDynamics, Grafana/Prometheus, and Dynatrace.
  • Experience with tools like Rally, Confluence, and other CI/CD extenders.
  • Hands-on experience implementing in-memory caching solutions.
  • Experience with Redis is a plus.
  • Excellent debugging skills across a variety of technical platforms integrated through an API gateway.
  • Hands-on experience with GCS, Cloud SQL, Spanner, and Firestore.
  • Extensive experience in enterprise-level infrastructure and operations.
  • Experience in high availability and distributed systems, Linux and Windows administration, troubleshooting, and support.
  • Experience monitoring and troubleshooting HashiCorp Vault environments, ensuring minimal downtime and rapid recovery from incidents.
  • Working knowledge of Vertex AI, Gen AI, and BigQuery.
