Site Reliability Engineer
A site reliability engineer (SRE) contributes by implementing tools to deploy, monitor, and maintain deployments of our software stack. In addition to building tools for these deployments, an SRE collaborates with the engineering team to debug production deployment issues.
An SRE:
- Builds tools to deploy, monitor, and maintain deployments.
- Debugs production deployments in collaboration with relevant engineering teams.
- Assists with CI/CD pipelines.
- Manages deployment infrastructure (e.g., EC2).
- Interfaces with enterprise infrastructure teams for enterprise deployments.
- Collaborates with other engineering teams to build scalable infrastructure.
An SRE should have:
- 3+ years of software development experience, preferably in industry.
- 3+ years of experience as an SRE or in an equivalent role.
- Bachelor's degree in CS or comparable experience.
- Experience deploying production code in SaaS and on-prem environments.
An SRE would preferably also have:
- Experience with Kubernetes and Docker.
- Experience with AWS and EC2.
- Experience interfacing with customers to debug on-prem environments.
- Experience with zero-downtime upgrades.
- Experience with production database migrations.
- Experience escalating and resolving critical production issues.