A fun and varied week of articles, with something for everyone. Don’t miss the article from Uber engineering about their discovery for minimizing P99 spikes on CPU workloads. Oh, and there are only a few days left to submit a talk proposal for Monitorama PDX 2022. Hope to see you there!
This issue is sponsored by:
The Promscale team has built a lightweight, easy-to-deploy microservices demo instrumented with OpenTelemetry so you can play around with tracing. The demo also includes 6 pre-built Grafana dashboards to monitor upstream and downstream dependencies, throughput, latency, and error rates. Check out this blog post for a complete walkthrough!
Articles & News on monitoring.love
Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.
From The Community
What happens when your company grows by leaps and bounds but your infrastructure falls behind? Miro shares their story of scaling up their monitoring architecture and processes to keep up with their data platform.
One engineer’s hot take on what Observability is, or is not. Grab your popcorn.
How Airbnb engineers managed to isolate runtime performance characteristics of their service mesh as a single metric.
An excellent article on how Pinterest has increased the reliability and efficiency of their Kubernetes-based control plane.
A handy guide from Grafana Labs on how Prometheus’ relabeling works and when (and how) to use it effectively.
This isn’t strictly monitoring related, but it feels like a big deal… affecting anyone running CPU-heavy workloads in cgroups. Plus, it’s worth it just to see those P99 spikes go away. 😁
I love this article from Brendan Gregg on why we do (or don’t) choose certain products. Frankly, it feels like the making of a great checklist for any new potential vendor.
If you’re considering RKE2 but already have an investment in Elasticsearch and Kibana, here’s a pattern for using them in place of Rancher’s default options.
Observability lessons from a fifty-year-old sitcom.
I don’t really grok why the author chose to write this in the context of API observability, but it’s still a solid standalone article on observability in general. I particularly appreciate the background on monitoring software and principles, the discussion of signals, and the catch-up on the “state of the art” of observability tooling today.
How one engineer uses Longhorn distributed block storage for their HA Prometheus nodes on Kubernetes.
Monitorama is returning to Portland, OR, this summer. The organizers have recently opened up their CFP for a limited number of speaking slots. The deadline for submissions is March 31, 2022.
Ready to lower your AWS bill? Now might be the perfect time for an AWS Cost Optimization project with The Duckbill Group. The Duckbill Group aims for a 15-20% cost reduction in identified savings opportunities through tweaks to your architecture–or your money back. (SPONSORED)
See you next week!
– Jason (@obfuscurity) Monitoring Weekly Editor