Issue #047

Hey folks, welcome to another installment of Monitoring Weekly! Did you write something about monitoring recently? Maybe got an idea rolling around in your head? Send it on over and let the community learn from you. 😀

Monitoring News, Articles, and Blog posts
Oncall and Sustainable Software Development

On-call can be pretty awful. But does it have to be? What if there was a better way?

Getting helpful CloudWatch alarms in Slack

CloudWatch alarms kinda stink, as the author of this article points out. They often don’t contain what you want to know, have some esoteric references in them, and tend to expect that the reader fully understands the underpinnings of the architecture. All pretty big asks. The author of this post (and code!) walks us through how his team improved the alerts they’re getting from CloudWatch using SNS and a Lambda function. Code is included in the post.
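If you want a feel for the general shape before clicking through, here is a minimal sketch of the SNS-to-Lambda-to-Slack idea. This is not the author's code. It assumes the alarm arrives as the standard CloudWatch alarm JSON inside an SNS record, and that a Slack incoming webhook URL lives in an environment variable I've called SLACK_WEBHOOK_URL (a made-up name). The function names are also just for illustration.

```python
import json
import os
import urllib.request


def format_alarm(sns_message: str) -> str:
    """Turn the JSON body of a CloudWatch alarm notification into a short Slack message."""
    alarm = json.loads(sns_message)
    trigger = alarm.get("Trigger", {})
    lines = [
        f"*{alarm.get('AlarmName', 'unknown alarm')}* is now {alarm.get('NewStateValue', 'UNKNOWN')}",
        f"Metric: {trigger.get('Namespace', '?')} / {trigger.get('MetricName', '?')}",
        f"Reason: {alarm.get('NewStateReason', 'no reason given')}",
    ]
    # AlarmDescription is optional, so only include it when it is set.
    if alarm.get("AlarmDescription"):
        lines.append(f"Notes: {alarm['AlarmDescription']}")
    return "\n".join(lines)


def handler(event, context):
    """Lambda entry point: post one Slack message per SNS record."""
    # SLACK_WEBHOOK_URL is an assumed variable name, not something from the article.
    webhook_url = os.environ["SLACK_WEBHOOK_URL"]
    for record in event["Records"]:
        text = format_alarm(record["Sns"]["Message"])
        body = json.dumps({"text": text}).encode("utf-8")
        request = urllib.request.Request(
            webhook_url,
            data=body,
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(request, timeout=5)
```

Keeping the formatting in its own function means you can tweak the wording against a saved alarm payload without touching Slack at all.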

From a Time-Series Database to a Key Operational Technology for the Enterprise: Part I

This monster post, only Part 1 of a 3-part series, begins to explore what it takes to move time series pipelines from mere app monitoring to an enterprise service, capable of being used effectively in larger-scale industrial and manufacturing environments.

Scale up, speed up: Join us at Dash in July!

Datadog is putting on their first user conference in New York City this July. Looks like it should be a great event. The CFP is open until March 16th.

Release ElastiFlow v2.0.0 · robcowart/elastiflow

I first mentioned the ElastiFlow plugin back in March, along with how big of a fan I am. Now, v2 is available with a ton more features and improvements, most notably IPFIX and sFlow support.

Why, as a Netflix infrastructure manager, am I on call?

Why be on-call if you don’t have to be? The author gives a few compelling reasons for why they’re still on-call as a manager.

See you next week!

— Mike (@mike_julian) Monitoring Weekly Editor