Staff Software Engineer
Company: Cadre
Location: San Francisco, CA 94112
Description:
Join us at Cleric
We're building an autonomous AI SRE that helps software engineering teams reliably investigate production incidents. Our agent combines LLMs with tools to understand systems, reason through problems, and take corrective actions - even for issues it hasn't encountered before. Our mission is to let engineers focus on building products, not fighting fires.
We're a small team of AI and infrastructure veterans backed by leading AI investors. Cleric is already in production at high-scale companies, saving engineers hundreds of hours on investigations.
About the role
As Staff Software Engineer at Cleric, you'll build the platforms that enable our product to operate at scale across multiple customers. You'll design and implement the infrastructure for agent deployment, execution, evaluation, and learning - creating systems that let us run our agents efficiently and reliably.
You'll architect systems for collecting production data, feeding it into our simulation environments, and using it to train agents at scale. This includes building deployment pipelines, scaling mechanisms for handling increased load, and integration frameworks for diverse customer environments.
Beyond the technical implementation, you'll establish engineering patterns and best practices that help our team maintain high quality as we grow. You'll mentor other engineers, provide technical direction, and ensure we're making pragmatic architectural decisions that balance immediate needs with long-term scalability.
You'll have significant technical autonomy in designing and implementing these systems, working closely with our founding team to expand our platform capabilities while maintaining high engineering standards.
What you'll do
- Build and scale our agent platform, evaluation systems, and the web applications where users manage their Cleric deployment
- Design data collection and processing pipelines that power our training environments
- Implement monitoring and observability systems to track platform and agent performance
- Create testing frameworks and development patterns to maintain engineering quality
- Build APIs, tools, and libraries that help our team ship quickly and reliably
- Establish engineering best practices and mentor other engineers as we grow
- Scale our infrastructure and systems to support rapid customer growth
You have
- 6+ years of production software engineering experience
- Strong software engineering fundamentals with a focus on simplicity and maintainability
- Deep experience with observability tools and practices (Datadog, OpenTelemetry)
- Track record of building reliable, scalable systems
- Experience with cloud infrastructure (GCP, AWS) and containerization
- Strong opinions about software engineering practices
- Experience being on-call and handling production incidents
- Ability to challenge assumptions and propose pragmatic solutions
- Curiosity and drive to learn new technologies
Nice to have
- Experience with LLM-based systems
- Background in building developer platforms
- Previous startup experience
How we work
- Small teams, big impact: We believe that small teams can deliver great products
- Culture matters: We value radical candor in a positive and inclusive work environment
- In-person collaboration: We believe in working closely to deliver the best results
- AI-first approach: We don't simply build AI products; we augment ourselves with it
Interview process (you'll meet most of the team via the process)
1. Intro Call
- Discuss your experience, the company, product, and the role
2. Software Engineering Session (60 mins)
- Collaboratively build an application
- Focus on practical software engineering, not algorithm challenges
3. System Design Session (90 mins)
- Work through a system design problem relevant to your daily work
4. Product thinking and engineering practices (60 mins)
- Talk about your perspectives on building a great product
- Deep dive on engineering practices and culture