Did you know I have a podcast too? Check it out: Real World DevOps
This issue is sponsored by:
It’s a story of how VictorOps builds a DevOps culture centered around accountability and collaboration to build more reliable services and bolster SRE efforts. Written by Jason Hand, it dives into both the technical aspects of monitoring, observability, alerting, etc., as well as cultural aspects of collaboration, workflow transparency, etc.
Latest on monitoring.love
Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.
From The Community
Namespacing metrics is hard, yo.
Including this because it introduced me to a new tool: Skylight.
This one has absolutely nothing to do with Ops/Engineering…or does it? I’ve been refreshing my knowledge on SLIs and SLOs lately, while also re-reading How To Measure Anything, and something has stuck out to me this time: an SLI is directly correlated to how happy your users are, but how do you measure user happiness? As in the example in this article, what you really want are some leading indicators, not lagging indicators. “Revenue earned” is a lagging indicator, for example, while “shopping carts abandoned” is a good leading indicator.
Spoiler: ELK, Datadog, and a bunch of glue. Perhaps most interesting here is that they’re using AWS’s Elasticsearch for some areas with success.
Two awesome tools, now working together.
**[Logflare | Tail -f Cloudflare Logs](https://logflare.app/)**
From the site, “Because Cloudflare doesn’t give you logs unless you’re on an Enterprise plan.” I can’t help but wonder how long until Cloudflare shuts this capability down. In the meantime: neat!
One thing I love about HAProxy is their understatedness. This “introduction” is more in-depth than many “deep dives” I’ve read.
If you were starting to get confused by the silly ‘OpenWhatever’ naming patterns, this article from the folks at Datadog does a great job of explaining what the hell is going on.
The math proof is even included at the end. 💥
VP of Product at Grafana, Tom Wilkie, gave a great talk about their new logging tool, Loki, at FOSDEM 2019 recently.
Following up on their announcement of the postmortem documentation, PagerDuty talks more about how to adopt a learning culture in your org, a crucial but often-overlooked aspect of improving your postmortem process.
This issue is sponsored by:
Yes, it’s a thing! Blue Medora helps you integrate your on-prem infrastructure and your cloud infrastructure into one place. Rather than making your users learn yet another monitoring tool, Blue Medora acts as a bridge, transparently shipping metrics from your datacenter hardware to monitoring tools of your choice.
See you next week!
– Mike (@mike_julian) Monitoring Weekly Editor