Hey folks, welcome to another installment of Monitoring Weekly! Did you write something about monitoring recently? Maybe got an idea rolling around in your head? Send it on over and let the community learn from you. :D
Monitoring News, Articles, and Blog posts
Health Score Metrics as a Software Craftsmanship Enabler
Kind of a tangential area, but LinkedIn wrote about the tools and processes they use for monitoring (and encouraging!) the “health” of their own software development over time by creating a sort of index number based on metrics such as test coverage, style/lint coverage, dependency freshness, and others. The result is a single number that, ideally, teams are encouraged to improve.
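To make the "index number" idea concrete, here's a minimal sketch of one way to roll several metrics into a single score. To be clear, this is not LinkedIn's actual formula: the weighted average, the metric names, the weights, and the 0-100 scale are all assumptions I made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    value: float   # assumed to be normalized to 0.0-1.0 already
    weight: float  # relative importance (hypothetical, pick your own)


def health_score(metrics):
    """Combine normalized metrics into one 0-100 score via a weighted average."""
    total_weight = sum(m.weight for m in metrics)
    if total_weight <= 0:
        raise ValueError("at least one metric needs a positive weight")
    # Clamp each value to [0, 1] so one bad input can't skew the index.
    weighted = sum(min(max(m.value, 0.0), 1.0) * m.weight for m in metrics)
    return round(100 * weighted / total_weight, 1)


# Hypothetical numbers for a single team's codebase.
example = [
    Metric("test_coverage", 0.80, 3.0),
    Metric("lint_coverage", 0.90, 1.0),
    Metric("dependency_freshness", 0.50, 2.0),
]

print(health_score(example))
```

The weights are where the real arguments happen: whatever you weight most heavily is what teams will end up optimizing for.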
I love reading about other people’s monitoring stacks and this writeup from the folks at OLX Group is certainly a good read. It’s a pretty complex setup with a bunch of tools strung together, including some I haven’t seen in a while (well hello there Brubeck!) and some new ones (moira).
These two articles are a two-parter on building Telegraf plugins yourself. I really like how straightforward they are. Well, assuming you already know Golang, that is.
Splunk just launched support for the handling of metrics, with built-in inputs for statsd and collectd. Sadly, Splunk has never been good at metrics–it’s just not in their DNA as a company. It should be interesting to see how this works out, and it’s certainly one more check mark for those that try to use Splunk for everything under the sun (for better or worse). I’m really curious what the impact on the daily ingestion volume will be with metrics ingestion at scale.
**[InfluxDays: Software Tools & Technology Conference for building time series based apps](https://influxdays.com/)**
InfluxData, the folks behind Telegraf, InfluxDB, Chronograf, and Kapacitor (aka the “TICK stack”), are putting on their first conference in San Francisco on November 14th. It’s a single day, but I’m liking the lineup. I’ll be there covering the event and reporting back for all those who can’t make it, but for those who are coming–come find me and say hi!
This talk from SREcon17 EU is a really great explanation of incident management procedures and how Shopify makes them way less cumbersome by leveraging a Lita-based chatbot. It’s pretty neat stuff–check it out.
See you next week!
– Mike (@mike_julian) Monitoring Weekly editor