This issue is sponsored by:
Everyone wants stats-based alerting, but it’s not always straightforward to do. InfluxDB’s Holt-Winters support is pretty great though, and easy to use. Learn more about it here.
My thanks to InfluxData for their support of Monitoring Weekly.
From The Community
I love this article, especially how it starts out: what, exactly, does accuracy mean? From the article: “But what is accuracy then? Is the algorithm expected to be 73% accurate when it reports a 73% confidence? What does accuracy mean in this situation? Does accuracy mean the number of answers with more than 50% confidence that were correct? Is 50% the right threshold? How do we count the null answers? What if both yes and no have less than the required confidence? How is that counted?”
The folks at SemaText have written a pretty great five-part series on OpenTracing.
From the article: “This post will show you some tools and techniques for managing Golang logs. We’ll begin with the question of which logging package to use for different kinds of requirements. Next, we’ll explain some techniques for making your logs more searchable and reliable, reducing the resource footprint of your logging setup, and standardizing your log messages.”
Super neat visualization for understanding communication patterns in distributed systems, but if this doesn’t make you start regretting breaking up your monolith, I don’t know what will.
I don’t normally link to what amounts to “how to use our product” articles, but given that roughly 99% of you probably use PagerDuty, it’s relevant. Especially because the organization and setup aren’t quite as intuitive or obvious as they probably could be.
Because outages can also be beautiful.
From the site: “It is a bash script that uses your current kubectl context to interactively select namespaces and multiple pods to download logs from. It basically runs kubectl logs in a loop for all containers, redirecting the logs to local files.”
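For a feel of what that loop amounts to, here is a minimal Python sketch of the same idea. It is not the script itself, which is written in bash and selects pods interactively. The function names, the file naming scheme, and the list of (namespace, pod, container) targets are all assumptions made for illustration; only the kubectl logs invocation and its --namespace and --container flags are real.

```python
import subprocess
from pathlib import Path


def build_log_command(namespace, pod, container):
    """Return the kubectl argv that prints one container's logs."""
    return ["kubectl", "logs", "--namespace", namespace, pod, "--container", container]


def log_file_name(namespace, pod, container):
    """Hypothetical naming scheme: one local file per container."""
    return f"{namespace}_{pod}_{container}.log"


def download_logs(targets, out_dir="logs", run=subprocess.run):
    """Run kubectl logs once per target and redirect the output to a file.

    targets is an iterable of (namespace, pod, container) tuples. How they
    are chosen (the original script asks interactively) is out of scope here.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for namespace, pod, container in targets:
        path = out / log_file_name(namespace, pod, container)
        with path.open("wb") as fh:
            # stdout=fh plays the role of the shell's "> file" redirection.
            run(build_log_command(namespace, pod, container), stdout=fh, check=True)
        written.append(path)
    return written
```

Calling download_logs([("default", "web-0", "app")]) would use whatever kubectl context is currently active, just as the quoted description says the script does.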
Just a recap and a link to all of the videos from the recent GrafanaCon in Los Angeles.
My friend Thai Wood over at Resilience Roundup walks us through how on-call functions at NASA.
If you’re wondering how good your SLOs are, you should watch this.
This issue is sponsored by:
We’re hosting an online workshop on Tue 3/26 at 9:00am PT on building, deploying and monitoring containers. Sylvia Fronczak (Software Engineer) and Dave McAllister (Scalyr Community Guy) will show live code and examples to accompany container orchestration concepts. They’ll also show how to get started with monitoring containers. Sign up for the online workshop.
For the LogicMonitor fans among you, the LogicMonitor Lever Up conference is this June. I’ll be speaking, so come hang out/heckle!
See you next week!
– Mike (@mike_julian) Monitoring Weekly Editor