Issue #057

Hey folks, welcome to another installment of Monitoring Weekly! Did you write something about monitoring recently? Maybe you've got an idea rolling around in your head? Send it on over and let the community learn from you. 😀

Monitoring News, Articles, and Blog Posts

Kubernetes, Local to Production with Django: 6— Add Prometheus & Grafana Monitoring With Helm

Exactly what it sounds like: how to set up Prometheus and Grafana on Kubernetes using Helm.

The Super Logger, your new favorite hero

This looks like a great tool for solving logging pain points with Node.js, though it’s also reminding me why I never want to write any Node.js at all. For those of you without that choice, check this out.

Kubernetes PromQL (Prometheus) CPU aggregation walkthrough

Prometheus’s PromQL can be daunting for those new to it. The author, being new to PromQL, wrote up their notes on a small task: aggregating CPU usage by pod label. These notes certainly make PromQL much more approachable.
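
If you want a feel for what "aggregating CPU by pod label" boils down to before reading the article, here is a rough Python sketch of the idea. It is not the author's query and not how Prometheus implements it: the pod names, counter values, label mapping, and the `cpu_by_label` helper are all made up for illustration. It takes two readings of a cumulative CPU-seconds counter per pod, turns each into a per-second rate, and sums the rates by a label.

```python
from collections import defaultdict

# Hypothetical data: cumulative CPU seconds per pod at two scrape times.
# All names and values here are invented for illustration.
SCRAPE_INTERVAL_SECONDS = 60.0

earlier = {"web-1": 120.0, "web-2": 300.0, "worker-1": 50.0}
later = {"web-1": 150.0, "web-2": 318.0, "worker-1": 56.0}

# Hypothetical pod -> "app" label mapping (the "join" step).
pod_labels = {"web-1": "web", "web-2": "web", "worker-1": "worker"}


def cpu_by_label(earlier, later, pod_labels, interval):
    """Compute a per-second CPU rate per pod, then sum the rates by label.

    Simplification: counter resets (a pod restarting) are not handled.
    """
    totals = defaultdict(float)
    for pod, end_value in later.items():
        if pod not in earlier or pod not in pod_labels:
            continue  # a pod needs both samples and a label to contribute
        rate = (end_value - earlier[pod]) / interval  # CPU seconds per second
        totals[pod_labels[pod]] += rate
    return dict(totals)


result = cpu_by_label(earlier, later, pod_labels, SCRAPE_INTERVAL_SECONDS)
print(result)
```

The two steps here (rate per series, then a grouped sum) are roughly the shape of a PromQL `sum by (...)` over a `rate(...)`; the article covers the real query and the label-joining details.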

Grafana v5.1 Released

Lots of awesome stuff in this release: heatmap support for Prometheus data, Microsoft SQL Server as a data source, and more.

MacOS monitoring the open source way

Security monitoring on macOS at Dropbox using osquery, Google’s Santa, and Apple’s OpenBSM. There is also a fantastic discussion on Hacker News about the user privacy implications of this setup (among other topics).

Google: Addressing Cascading Failures

What I really love about this article is how the author goes into detail on what resource exhaustion looks like and how it impacts various metrics.

Incident Management at Netflix Velocity

David Hahn, one of the handful of people on the Netflix SRE (“CORE”) team, gave this talk last year at QCon SF, and it bubbled back up this week. It is a fantastic talk, not least because David is a wonderful speaker.

See you next week!

— Mike (@mike_julian)
Monitoring Weekly Editor