This week’s issue is heavy on logging stories, software updates, and remote job postings. Oh, and one of the most unusual applications for Prometheus I’ve seen. Mee-oww. 🙀
This issue is sponsored by:
Start incident response with context to all your alerts in one view
Moogsoft speeds up incident response with dynamic anomaly detection, suppressed alert noise, and correlated insights across all your telemetry data. Go from debugging across multiple tools, screens, and dashboards into a single incident view so you and your teams can take a more proactive approach to reduce MTTR. Sign up for the Moogsoft Free community plan today!
Articles & News on monitoring.love
Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.
From The Community
How VakıfBank designed and manages their centralized logging infrastructure with minimal loss and delays.
PayPal’s ETL pipeline handles 30-55 billion events each day. With this much streaming data, any downtime has a cascading effect on downstream applications and analytics. Here are some of the critical metrics they monitor to ensure the overall health of the system.
This might be the best article I’ve read on incident response and postmortems in a long while. Read this. Share it.
An honest and transparent look at all of Honeycomb’s incidents and lessons learned during a very busy September. Fantastic article.
How one company leveled up their observability by chucking the default Azure Cloud dashboards and replacing them with Grafana.
Next time someone tells you they’ve “seen some 💩”, just send them this story. There truly is a Prometheus exporter for everything.
If you’ve run NGINX in production, you’re probably familiar with the scarcity of useful metrics in the open source version. The Apache APISIX project (based on NGINX’s underlying network libraries) would like you to consider swapping out NGINX for their project (with improved observability).
Work. Without the hard work.
LogicMonitor empowers teams to spend less time troubleshooting and more time innovating with fully automated infrastructure monitoring and log analysis. AI-powered intelligence automatically detects monitoring resources, surfaces anomalies, and provides root cause analysis across your entire stack. Leave the manual configuration, expensive hardware, and long hours of troubleshooting behind with a free trial of LogicMonitor. (SPONSORED)
I’m grateful that most logging isn’t this complicated, but this is still an interesting solution for processing serverless logs and alerts using OpenFunction and Kafka.
A new version of Tempo is out, with performance improvements, recent traces search, and a new “scalable single binary” operational mode. Oh, and a handful of breaking changes.
Loki is also seeing a new release, with support for out-of-order logs (wait, seriously?) and its own new “simple scalable deployment” mode.
“KubeScrape: An open-source dev tool that provides an intuitive way to view the health, structure, and live metrics of your Kubernetes cluster.”
“Privateer is a lightweight Kubernetes prototyping and monitoring tool developed in Electron.js.”
“Kr8s is a desktop application made for developers that need to monitor and visualize their Kubernetes clusters in a user friendly GUI.”
Ready to lower your AWS bill? Now might be the perfect time for an AWS Cost Optimization project with The Duckbill Group. The Duckbill Group aims for a 15-20% cost reduction in identified savings opportunities through tweaks to your architecture, or your money back. (SPONSORED)
See you next week!
– Jason (@obfuscurity) Monitoring Weekly Editor