Is Apdex useful?

I’ve recently been trying to work out which SLOs to define for some services, and wondering whether Apdex is a useful metric. (See my previous post on the difference between SLIs, SLOs and SLAs.)


What are SLIs, SLOs and SLAs? 

Service Level Indicators (SLIs) are metrics that you choose to measure the health and performance of your services. Service Level Objectives (SLOs) are the desired targets for those indicators. Service Level Agreements (SLAs) build on this and include the consequences of not meeting those targets. All are fundamental to Site Reliability Engineering.

In this post, I’ll try to explain each in more detail, show how they relate to each other, and give some examples of each.
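
As a rough sketch of how an SLI and an SLO relate, here is a small Python example. The names (Slo, availability_sli, meets_slo), the 99.9% target, and the request counts are all made up for illustration, and treating zero traffic as "healthy" is an assumption rather than a rule.

```python
from dataclasses import dataclass


@dataclass
class Slo:
    """A target for an SLI (hypothetical structure, for illustration only)."""
    name: str
    target: float  # e.g. 0.999 means 99.9% of requests should be good


def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI: the fraction of requests that were served successfully."""
    if total_requests == 0:
        return 1.0  # assumption: no traffic counts as meeting the objective
    return good_requests / total_requests


def meets_slo(sli: float, slo: Slo) -> bool:
    """The SLO is met when the measured SLI is at or above the target."""
    return sli >= slo.target


# Illustrative numbers only, not measurements from a real service.
slo = Slo(name="availability", target=0.999)
sli = availability_sli(good_requests=99_950, total_requests=100_000)
print(f"SLI={sli:.4f}, SLO met: {meets_slo(sli, slo)}")
```

The SLI is the measurement and the SLO is the threshold it is compared against. An SLA would sit on top of this as a contract describing what happens when the threshold is missed, which is not something code like this captures.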


SRE vs DevOps

I’m really enjoying the Seeking SRE book. Chapter 12 covers SRE vs DevOps as a community-sourced compare-and-contrast discussion.

My favorite description is from Thomas Limoncelli, who suggested that:

DevOps engineers focus on the SDLC pipeline with occasional responsibilities for production operations. SREs focus on production operations with occasional responsibilities for the SDLC pipeline.


Book chapter summary: Postmortem Culture, from the SRE Book

I’m really enjoying reading the excellent “SRE Book”. Chapter 15, “Postmortem Culture: Learning from Failure”, in particular really struck a chord with me. The following is a slightly summarized version of it.

TLDR: Failures are inevitable, especially in distributed systems. To learn from them, document them in postmortems, avoid blame, and share what you learned across your org.


Talk summary: SRE principles by Tori Wieldt @ AWS re:Invent 2018

I caught a talk by Tori Wieldt at the New Relic booth at AWS re:Invent on “SRE principles”. Even though it was a short talk in the expo hall, rather than a formal scheduled one, it had a ton of good SRE material.


