Before You Build an Observability Practice: The Questions That Actually Matter


In our experience sitting in the trenches of global IT operations, we have seen a recurring pattern that separates the market leaders from those who are simply burning capital. Most organisations approach observability as a "purchase": a line item on a budget meant to buy a tool that promises to "fix" visibility.

They buy the licenses, deploy the agents, and then wait for the magic to happen. But magic rarely arrives in a box. Instead, they find themselves drowning in alerts, paying astronomical egress fees for logs no one reads, and still facing the same "War Room" chaos when a P1 incident hits.

At Visibility Platforms, we believe that observability is not a project you finish; it is a practice you build. Before you write a single line of code or sign a multi-year vendor contract, there are fundamental, strategic questions that must be answered. If you cannot answer these, you aren't building a practice: you're just buying a very expensive dashboard.

The Economic Anchor: Who is Actually Paying?

It sounds like a simple administrative question, but it is the pivotal point in your strategy. Who pays for the observability platform? Is it the central IT budget? Is it split across DevOps squads? Or is it tied directly to the business units whose revenue depends on system uptime?

When the "central pot" pays, observability often becomes a bloated, unmonitored cost centre. There is no incentive for individual teams to optimise their telemetry because the "bill" is someone else’s problem. However, when you treat observability as an investment rather than a tax, the conversation shifts.

We encourage our partners to put a tangible value on every data source. If a specific log stream costs £50,000 a year to ingest and store, what is the return? Does it reduce the Mean Time to Resolution (MTTR) for a critical revenue-generating service? Or is it just "nice to have" forensic data for a system that hasn't failed in three years? By forcing the question of "Who pays?", you force a culture of accountability and value.
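To make the "what is the return?" question concrete, here is a minimal Python sketch of that comparison. Only the £50,000 annual cost comes from the example above; the stream names, incident counts, minutes saved, and downtime cost per minute are hypothetical placeholders that you would replace with your own estimates.

```python
from dataclasses import dataclass


@dataclass
class TelemetrySource:
    name: str
    annual_cost_gbp: float               # ingest + storage cost per year
    incidents_per_year: int              # incidents where this source helped (assumed)
    mttr_minutes_saved: float            # estimated minutes saved per incident (assumed)
    downtime_cost_per_minute_gbp: float  # estimated cost of one minute of outage (assumed)

    def annual_value_gbp(self) -> float:
        # Value = incidents x minutes saved per incident x cost of a minute of downtime
        return (self.incidents_per_year
                * self.mttr_minutes_saved
                * self.downtime_cost_per_minute_gbp)

    def net_value_gbp(self) -> float:
        # Positive: the source pays for itself. Negative: it is a candidate for pruning.
        return self.annual_value_gbp() - self.annual_cost_gbp


# Hypothetical sources: one revenue-critical, one "nice to have"
sources = [
    TelemetrySource("checkout-app-logs", 50_000, 12, 30, 400),
    TelemetrySource("legacy-hr-debug-logs", 50_000, 0, 0, 50),
]

for s in sources:
    print(f"{s.name}: net value £{s.net_value_gbp():,.0f} per year")
```

The model is deliberately crude. Its purpose is to force each team to write down an estimate that can be challenged, not to produce a precise figure.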

Defining the Users: The War Room Perspective

Who will be using this practice? If the answer is "everyone", the real answer is likely "no one".

A successful observability practice serves specific personas with different needs. The SRE squad needs deep-dive traces to find a memory leak in a microservice. The Product Owner needs to know if a slow checkout process is causing a 5% drop in conversion rates. The Leadership Team needs to know if the overall platform stability is improving month-on-month.

We have spent years in critical incident war rooms, and we know that in the heat of a crisis, you don't need 500 different dashboards. You need a single source of truth that tells you where the fire is and how to put it out. If you haven't defined your users and their "must-have" metrics, you will end up with a toolset that is "a mile wide and an inch deep": plenty of data, but zero actionable insight.

[Image: Surgical data ingestion and value filtering]

The Quality Trap: Why Ingesting Everything is a Failure

There is a dangerous myth in our industry: "More data equals more visibility." This is fundamentally untrue. In reality, more data often leads to more noise, higher costs, and slower query times.

In our previous discussions on FinOps for Telemetry, we've highlighted the need to be surgical. You must be careful what you ingest. Modern observability platforms often charge by volume, and the "ingest everything" trap is a fast track to budget exhaustion.

We advocate for a "Value on Everything" mindset. Every metric, trace, and log should have a reason for existing in your production environment.

Is it actionable? If an alert triggers, does a human or a system know exactly what to do?
Is it unique? Are you collecting the same data via three different agents?
Is it necessary? Do you need 10ms granularity for a legacy internal HR tool?

By pruning the noise at the source, you make the signal louder and clearer. This isn't just about saving money; it’s about making your engineers more effective. A lean, high-quality dataset allows for faster troubleshooting and a more stable ecosystem.
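The three questions above can be sketched as an automated review of a telemetry inventory. This is an illustrative Python example, not a feature of any particular platform: the `Stream` fields, the stream names, and the 60-second threshold for non-critical systems are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stream:
    name: str
    service: str
    signal: str              # what is measured, e.g. "cpu.utilisation"
    has_runbook: bool        # does a human or system know what to do when it alerts?
    interval_ms: int         # collection granularity
    business_critical: bool


def review(streams, min_interval_ms_non_critical=60_000):
    """Return {stream name: [reasons the stream should be questioned]}."""
    first_seen = {}  # (service, signal) -> name of the first stream collecting it
    findings = {}
    for s in streams:
        reasons = []
        # Is it actionable?
        if not s.has_runbook:
            reasons.append("not actionable")
        # Is it unique?
        key = (s.service, s.signal)
        if key in first_seen:
            reasons.append(f"duplicates {first_seen[key]}")
        else:
            first_seen[key] = s.name
        # Is it necessary at this granularity?
        if not s.business_critical and s.interval_ms < min_interval_ms_non_critical:
            reasons.append("finer granularity than needed")
        if reasons:
            findings[s.name] = reasons
    return findings


# Hypothetical inventory
streams = [
    Stream("agent-a/checkout/cpu", "checkout", "cpu.utilisation", True, 10_000, True),
    Stream("agent-b/checkout/cpu", "checkout", "cpu.utilisation", True, 10_000, True),
    Stream("agent-a/hr-tool/latency", "hr-tool", "request.latency", False, 10, False),
]

findings = review(streams)
for name, reasons in findings.items():
    print(f"{name}: {', '.join(reasons)}")
```

A flagged stream is a prompt for a conversation with its owner, not an automatic deletion.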

Measuring Success: Outcomes over Dashboards

How do you measure the success of an observability practice? Many teams point to "100% agent coverage" or "50 new dashboards created." In our view, these are vanity metrics.

True success is measured in business outcomes. Are your P1 incidents decreasing in frequency? Is your MTTR dropping? Are your deployment "rollback" rates lower because you have better pre-production visibility?
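As a rough illustration of tracking those outcomes, the Python sketch below computes P1 frequency and MTTR per month from a list of incident records. The incident data is invented for the example, and the tuple layout (severity, opened, resolved) is an assumption; in practice the records would come from your incident management tool.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical incident records: (severity, opened, resolved)
incidents = [
    ("P1", datetime(2024, 1, 5, 9, 0), datetime(2024, 1, 5, 11, 0)),
    ("P1", datetime(2024, 1, 20, 14, 0), datetime(2024, 1, 20, 15, 0)),
    ("P2", datetime(2024, 1, 22, 8, 0), datetime(2024, 1, 22, 8, 30)),
    ("P1", datetime(2024, 2, 10, 10, 0), datetime(2024, 2, 10, 10, 45)),
]


def p1_summary(incidents):
    """Group P1 incidents by the month they opened; report count and MTTR in minutes."""
    durations = defaultdict(list)
    for severity, opened, resolved in incidents:
        if severity == "P1":
            month = opened.strftime("%Y-%m")
            durations[month].append((resolved - opened).total_seconds() / 60)
    return {
        month: {"count": len(mins), "mttr_minutes": mean(mins)}
        for month, mins in sorted(durations.items())
    }


summary = p1_summary(incidents)
for month, stats in summary.items():
    print(f"{month}: {stats['count']} P1 incidents, MTTR {stats['mttr_minutes']:.0f} min")
```

Reviewing these two numbers month-on-month answers the questions above more directly than a count of dashboards or agents.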

We often refer back to our 3 Questions framework to keep strategy on track. If the work you are doing doesn't directly improve the reliability, performance, or cost-efficiency of the business, it is busy work. An observability practice should empower you to make data-driven decisions that actually move the needle on the company's bottom line.

The CMDB Mirage: Don’t Wait for Perfect

One of the most common traps we see is teams convincing themselves they cannot do meaningful observability until they have a solid, water-tight CMDB. In reality, that mindset creates paralysis. The CMDB becomes a prerequisite that is never quite finished, while outages, performance issues, and customer pain carry on regardless. A perfect model of every relationship in the estate might sound reassuring, but observability in the real world has to start with the environment you actually have, not the one you hope to document one day.

What matters more is whether your observability practice can speak the language of the business service. When something breaks, the question should not stop at a host name, a tag, or an IP address. The virtual desktop support agent needs to understand which user-facing capability is degraded. The operations team needs to know which service path is failing. Leadership and the board need clarity on what revenue, productivity, or customer experience is being impacted. That is where observability starts to create tangible value: not by perfecting infrastructure records in isolation, but by translating technical signals into service-level meaning that everyone can understand.

This is exactly why the earlier 3 Questions framework still matters. Smarter, more targeted collection will always outperform a bloated strategy built around waiting for a flawless data model. In our view, value-driven observability is about connecting the right telemetry to the right business outcomes as quickly as possible. A stronger CMDB can absolutely help over time, but it should support the practice, not delay it. If you are waiting for perfect before you begin, you are already losing momentum.

[Image: Leadership team focused on business outcomes in a war room setting]

The Long Game: Practice, Not Project

One of the biggest mistakes a leadership team can make is treating observability as a "one-and-done" implementation. Technology stacks evolve, customer behaviours change, and your observability needs must evolve in lockstep.

How do you get better? How do you manage change? A true practice includes a feedback loop. Every post-mortem should result in a refinement of your monitoring. Every new feature launch should include "observability-as-code" as a standard requirement.

This requires a cultural shift. It’s about moving from a reactive "break-fix" mindset to a proactive "observe and optimise" approach. It involves managing the evolution of tools like Dynatrace or OpenTelemetry and ensuring your team has the skills to leverage them. Change management isn't just about the technology; it's about the people and processes that use it.

[Image: Digital bonsai tree representing the continuous care of an observability practice]

The Visibility Platforms Perspective

Building an observability practice in a modern, complex digital ecosystem is unforgiving. The margin for error is slim, and the costs of getting it wrong are staggering. But when done right, it becomes a game-changer for operational excellence.

We don't just provide tools; we provide the strategic guidance to ensure those tools deliver tangible value. We act as an extension of your team, helping you navigate the "more data" trap and focusing your efforts where they will have the most impact.

Are you ready to stop "buying" tools and start building a practice? We anticipate that the most successful organisations in the coming years will be those that treat their telemetry with the same rigour as their financial data.

Is your observability strategy working for you, or are you working for it? We offer a free observability health check to help you identify the gaps in your current approach and build a roadmap for the future. Let’s make sure you’re asking the questions that actually matter.

At Visibility Platforms, we don't just see the data; we master the outcomes.