Shaun Abram
Technology and Leadership Blog
Testing in Production Presentation – SVCC 2018
The following post is essentially a written version of the “Testing in Production” talk I gave at Silicon Valley Code Camp 2018. You can find the presentation deck here on SlideShare.
Tags: chaos, chaosengineering, conferencetalks, For prep, integrationtesting, itestinprod, mytalks, production, resilience, resilienceengineering, Testing, testinginproduction
Book summary: Distributed Systems Observability
“Distributed Systems Observability” is a book by Cindy Sridharan (find her on Twitter and Medium), available as a free download here (registration required). At a little over 30 pages and 8,000 words, it is not a difficult read, and I definitely recommend it.
Tags: itestinprod, logs, metrics, observability, production, summary, Testing, testinginproduction, tracing
Testing in Production
Note that I gave a talk on this blog post in Dec ’18, if you prefer to watch that: https://www.youtube.com/watch?v=-b4QaEuFkP0
“Testing in production” used to be a joke. The implication was that by claiming to test in production, you didn’t really test anywhere, and instead just winged it: deploying to production and hoping that it all worked. Times have changed, however, and testing in production is becoming accepted as a best practice.
Tags: integrationtesting, itestinprod, production, talks, Testing, testinginproduction
Book summary: Chaos Engineering
“Chaos Engineering” is a book from O’Reilly (free download), written by folks from the Chaos Team at Netflix. It is a GREAT read for anyone interested in resilience engineering. This post is one of my summaries, essentially a cut and paste of the most salient parts (the original is about 16,000 words; this is about 3,000), with some paraphrasing and merging/rewriting of sections for brevity.
Tags: chaos, chaosengineering, netflix, resilience, resilienceengineering, summary, testinginproduction
Post Production Debugging
Monitoring and Observing Your App Post Release
Pre-release tests are essential, but the ability to debug, monitor and observe your application suite post-release is what allows you to detect, and quickly fix, the production problems that will inevitably arise.
Tags: apdex, KPIs, monitoring, mttd, mttf, mttr, o11y, observability, production, SLAs, testinginproduction