November 7, 2024
If you move in observability circles, chances are you have heard the phrase “Observability 2.0,” which refers to the idea that we need a new approach to observability. I am incredibly excited about the energy and discussion around a shift to “Observability 2.0,” as we now have a second chance to develop observability the way it was originally envisioned.
With that said, we need to be careful not to fall into the same traps of the past and only iterate on limited monitoring technologies and methodologies that don’t scale in a dynamic, digital world. We need to chart a new path forward that allows us to derive insights and act immediately within an always-changing universe.
Looking back now, when Peter Bourgon mapped out observability as a Venn diagram of metrics, tracing, and logging, we should have known market vendors would quickly hijack the discussion. Although many industry thought leaders and veterans did their best to clarify these concepts, the noise drowned out those voices. The end result was that monitoring essentially became conflated with observability.
Today, organizations that adopt “Observability 1.0” solutions come to realize two immediate pitfalls:
Complex, interdependent systems behave in unpredictable ways, no matter how much you instrument the code upfront. Distributed tracing has its place, but it can quickly morph into a game of whack-a-mole for the engineering team: every time the code, environment, or third-party services change, the team has to retrofit the code with new tracing (illustrated in the sketch below).
Modern organizations iterate on microservices, often releasing hundreds or even thousands of code changes a day. It’s impossible to continuously instrument code at that scale. Monitoring tools as they exist today would be fine if the world were static. But in an environment that’s constantly changing, those solutions simply can’t keep up.
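To make the retrofitting problem concrete, here is a minimal sketch of manual span instrumentation using the OpenTelemetry Python API. The service, function, and attribute names are hypothetical; the point is that every new code path or dependency needs its own hand-written span before it shows up in a trace.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter

# Wire up a tracer provider with a console exporter (requires the
# opentelemetry-sdk package; the exporter choice is purely illustrative).
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("checkout-service")  # hypothetical service name

def fetch_inventory(sku: str) -> dict:
    # Every downstream call needs its own hand-written span. When a new
    # dependency (say, a fraud-check service) is added, it stays invisible
    # in traces until someone comes back and instruments it too.
    with tracer.start_as_current_span("fetch_inventory") as span:
        span.set_attribute("sku", sku)
        return {"sku": sku, "in_stock": True}  # stand-in for a real call

def place_order(sku: str) -> dict:
    with tracer.start_as_current_span("place_order"):
        return fetch_inventory(sku)

if __name__ == "__main__":
    place_order("ABC-123")
```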
Thomas Johnson explains in a recent article, "... The building block of observability 2.0 is log events, which are more powerful, practical, and cost-effective than metrics (the workhorse of observability 1.0)."
I couldn't have put this better myself. The interesting part of the insight is that it is both a true statement and a paradox.
Metrics were originally seen as the cost-effective way to monitor an application and its supporting services and infrastructure components. This is because a metric is sampled data, collected periodically. When metrics are used for observability, the sampling frequency and the volume of aggregation can increase so dramatically that the cost of the telemetry exceeds the cost of the infrastructure running the application! Our customers commonly refer to this problem as the “untenable cost of observability.”
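A rough back-of-envelope sketch shows why. The series counts and scrape intervals below are assumptions chosen purely for illustration, not Sumo Logic figures; they show how sample volume, and with it cost, explodes as teams tighten sampling in pursuit of observability.

```python
# Back-of-envelope illustration (all numbers are assumptions): how metric
# volume grows with cardinality and scrape frequency.

def datapoints_per_day(time_series: int, scrape_interval_s: int) -> int:
    """Total metric samples produced in one day."""
    return time_series * (86_400 // scrape_interval_s)

# Hypothetical estate: 500 services x 50 metrics x 40 label combinations.
series = 500 * 50 * 40  # 1,000,000 time series

for interval in (60, 15, 5):  # tightening the scrape interval
    print(f"{interval:>2}s interval: {datapoints_per_day(series, interval):,} samples/day")

# 60s interval:  1,440,000,000 samples/day
# 15s interval:  5,760,000,000 samples/day
#  5s interval: 17,280,000,000 samples/day
```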
Metrics weren’t designed for granularity; they’re more of a pulse check. For deeper, contextualized observability, we need a single source of truth across exponentially complex and changing applications. Neither metrics nor traces can provide that: the former aren’t granular enough, and the latter are too narrow to capture constantly moving applications and their problems.
Ultimately, the only way to gain comprehensive visibility is to capture all the data exhaust that’s naturally emitted from your applications and infrastructure: the atomic level of logs. As Thomas Johnson highlighted, structured and unstructured logs are powerful and practical, giving technical teams the deep insights they need to truly understand the inner workings of their systems.
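For illustration, here is what a single wide, structured log event might look like. The field names are hypothetical, but note how much context one event carries compared with a counter or a latency bucket.

```python
import json
import logging
import time
import uuid

# Emit one request as a single wide, structured log event. A metric would
# collapse this into a count or a histogram bucket; the log keeps every
# attribute, so questions nobody anticipated can still be answered later.
logger = logging.getLogger("orders")
logging.basicConfig(level=logging.INFO, format="%(message)s")

event = {
    "timestamp": time.time(),
    "service": "order-api",
    "endpoint": "/v1/orders",
    "status": 502,
    "duration_ms": 1873,
    "customer_tier": "enterprise",
    "region": "eu-west-1",
    "upstream": "payments-v2",
    "trace_id": uuid.uuid4().hex,
}
logger.info(json.dumps(event))
```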
At Sumo Logic, our platform is cloud-native and, with schema on demand, it can analyze all the structured and unstructured log data across any enterprise in real time.
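As a conceptual sketch only (not Sumo Logic’s implementation), schema on demand means raw logs are stored as-is and fields are extracted when a query asks for them. The log lines and helper function below are made up to illustrate the idea.

```python
import json
import re

# Raw lines are kept untouched; no schema is declared up front.
RAW_LOGS = [
    '2024-11-07T10:02:11Z order-api status=502 duration_ms=1873 region=eu-west-1',
    '{"service": "payments-v2", "status": 200, "duration_ms": 341}',
]

def extract(line: str, field: str):
    """Pull a single field out of a raw line at query time."""
    try:  # structured (JSON) lines
        return json.loads(line).get(field)
    except json.JSONDecodeError:  # unstructured lines: parse key=value pairs
        match = re.search(rf"{field}=(\S+)", line)
        return match.group(1) if match else None

# "Query": which events were slow? No upfront schema or pipeline change needed.
slow = [line for line in RAW_LOGS if int(extract(line, "duration_ms") or 0) > 1000]
print(slow)
```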
We’ve always had these traits; they’re inherent in the system, like our technology’s DNA. Much like DNA, it’s so core to who we are that we sometimes forget it’s unique and different. Integrating AI into our DNA paves the way for this dynamic approach to observability, one that’s designed for applications and infrastructure that constantly change.
This new approach isn’t concerned with telemetry types; it’s about removing blind spots entirely and making sure digital teams have what they need to deliver on the outcomes of observability: applications that are reliable, secure, optimized, and prepared for the hardest challenges the future may hold, backed by a system that accelerates those outcomes without burdening your limited resources.
I’m excited about this new era of dynamic observability, and to see what other vendors bring to the table. More than that, I’m looking forward to sharing what is possible when we tech preview our vision at AWS re:Invent.
Visit us at re:Invent and book your demo now. Not going? Discover how cloud-scale log data is powering AI at Samsung.