This is an important point to drive home, so bear with me on this rant…
Monitoring involves tracking telemetry data over time. Which means, recording data as a time series allows for the continuous observation of changes and trends. One can then associate these data points with specific timestamps; making it easier to correlate events and activities with system performance, and therefore enabling better diagnosis of issues.
Monitoring data needs to be recorded as a time series because the temporal aspect is crucial for analyzing, understanding, and acting upon system performance and behavior. Time series data enables trend analysis, anomaly detection, capacity planning, root cause analysis, alerting, automation, visualization, and compliance reporting, all of which are essential components of effective monitoring.
OpenSource has steered monitoring towards TSDBs a.k.a Prometheus. But that has turned out to be a massive issue. (One is only postponing the inevitable by adopting a typical TSDB built for the 2000s)
Let me explain.
The banes of being OpenSource
Databases are typically used for transactional processing (OLTP - Online Transaction Processing). They are designed to handle a large number of short transactions such as insertions, updates, and deletions.
Question: When collecting monitoring data do we use it for OLTP? No. Not really.
We actually need it for OLAP (Online Analytical Processing). This is where a Data Warehouse shines. Data Warehouses are designed to handle large volumes of data and complex queries to support business intelligence and decision-making processes.
Monitoring data needs to be denormalized and optimized for read performance.
Guess what? That is precisely what a Data Warehouse offers, as opposed to a Database where data is typically normalized to reduce redundancy and improve data integrity.
The first principles of your monitoring stack
Understanding the first principles of data usage is the first step in taking control of your monitoring stack. If you fail to differentiate between these fundamentals, you stand to bear the brunt of them.
This is also why most folks, especially large ones, struggle with their monitoring. Once you scale, taking control of your monitoring gets harder because of base-level choices on the foundation of a monitoring stack.
For some of the larger companies we work with, a TSDB is pointless. In fact, i would contend that a TSDB should be a relic of the past for most orgs that are witnessing hockey-stick growth curves.
A Data Warehouse that helps you segment data based on needs and teams is also able to be highly performant, and significantly reduce costs. At a time when software monitoring is not well understood, and costs are spiralling out of control, it’s time folks took a hard look at these archaic practices.
For example, Last9’s Levitate comes with Blaze, Hot, and Cold tiers to help with fast queries and better cost management. This seems like an obvious need in hindsight, but most folks at the battlefront of their day job forget the need. (Or maybe because not many people have taken the time to step back and question how a monitoring stack should perform)
Data Tiering is critical, and the need of the hour. It’s lacking in a Time Series Database and desperately needs better solutions.
✌️
 
  
  
  
 