S
SRE (Site Reliability Engineering)
SRE applies software engineering principles to infrastructure and operations for reliable, scalable systems.
What is SRE?
Site Reliability Engineering (SRE) is a discipline that applies software engineering approaches to infrastructure and operations, originating at Google, focused on creating reliable and scalable software systems.
SRE principles
Error budgets, Eliminating toil, Automation, SLOs, Blameless postmortems.
Common misconceptions
- "SRE equals DevOps" — Related but distinct
- "SRE is just operations renamed" — Engineering focus
- "SRE only for large scale" — Principles apply at any scale