Take the War Out of the War Room

Nik Koutsoukos

New and more complex business technologies now emerge so quickly that they are starting to outpace IT organizations' ability to monitor the entire IT infrastructure and react to problems. This is particularly true as more enterprises adopt a hybrid model, with some resources managed in the data center and some in cloud or SaaS-based environments. At the same time, IT organizations have become increasingly siloed as different personnel develop skill sets specific to different pieces of the IT infrastructure, such as database management, the network and information security.

As a result, the “war room” – where IT personnel gather to diagnose and fix a problem – more often than not devolves into a session of finger-pointing and delays. Remedying this situation demands a new approach to managing performance, one that enables IT to become proactive instead of reactive, and collaborative instead of siloed.

Riverbed recently held a webinar on this topic, and one of our presenters was Forrester Vice President and Principal Analyst Jean-Pierre Garbani. He opened his remarks with a statement that nicely summarizes how predictive analytics technologies have radically reshaped how any company does (or should do) business: “Every company board, IT organization and leadership team should assume that there are – or will be – new ways to more efficiently service customers.”

In other words, counting on the luxury of being able to time the development and release of new products, applications or services to slow-moving market trends is a thing of the past. Just ask the taxicab industry. After more than a century of enjoying a monopoly, it suddenly finds itself in a battle for its life against data-driven services like Uber and Lyft. Or consider Kodak, Blockbuster, Tower Records or Borders for evidence of how quickly a long-established business model can become obsolete.

Today companies can collect massive amounts of data and use predictive analytics technologies to extract invaluable information, such as customer buying trends, supply chain capacity and commodity price futures, or to provide customers with data-driven offers. Enterprises are pouring money and energy into creating innovative applications and getting them to market faster, better and cheaper. Agile and DevOps capabilities can reduce release cycles from months to mere days, and the funding for these investments typically comes from spending reductions in infrastructure.

These complexities can quickly overwhelm human abilities and make the job of resolving problems and maintaining systems increasingly difficult and time-consuming. That impacts service quality. Forrester has conducted a number of surveys and found that 56 percent of IT organizations resolve less than 75 percent of application performance problems within 24 hours, and in some cases those performance issues can linger for months before resolution. Consider, for example, the outages that have affected services like Gmail or Dropbox.

The root of the problem is that IT grew up around domains such as the network, systems, applications and databases, and each domain team needed its own data to do its job. That has driven a proliferation of domain-centric point tools, which help each domain group but also mean that, even for very simple transactions, each team sees only part of the transaction, such as packet data or metrics from an app server. With this incomplete visibility, domain teams see different things because of inconsistent data sets and differing analytic approaches. That leads to a lack of collaboration, warring tribes and, ultimately, conflicting conclusions that slow time to resolution.

For example, last year Adobe’s move to cloud-based software backfired momentarily when database maintenance resulted in application availability issues. The company’s Creative Cloud service was unavailable for about a day, leaving users unable to access the web versions of apps such as Photoshop and Premiere. In total, the outage was said to have impacted at least a million subscribers. Other Adobe products were impacted during the downtime as well, including Adobe's Business Catalyst analytics tool. The company has since implemented procedures to prevent a similar outage from happening again.

This instance highlights the area where companies typically struggle to solve performance issues. Once a problem occurs, it usually doesn’t take long for a frustrated employee or customer to raise it with IT, and once the specific cause is identified, fixing and validating that fix should not take long. Where the delays occur is in the middle of that timeline: the diagnosis, or what Forrester refers to as the “Mean Time to Know” (MTTK).
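To make the timeline above concrete, here is a minimal sketch of how a team might measure the diagnosis phase separately from the fix. It assumes that each incident record carries three timestamps (reported, cause identified, fix validated) and that MTTK is the average gap between the first two; the `Incident` class, its field names and the sample data are all hypothetical, not taken from Forrester's definition or any particular tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    # Hypothetical record layout; field names are assumptions for illustration.
    reported_at: datetime   # when a user raised the problem with IT
    diagnosed_at: datetime  # when the specific cause was identified
    resolved_at: datetime   # when the fix was applied and validated


def mean_time_to_know(incidents):
    """Average time spent in diagnosis (reported -> cause identified)."""
    total = sum((i.diagnosed_at - i.reported_at for i in incidents), timedelta())
    return total / len(incidents)


def mean_time_to_fix(incidents):
    """Average time spent fixing and validating once the cause is known."""
    total = sum((i.resolved_at - i.diagnosed_at for i in incidents), timedelta())
    return total / len(incidents)


# Made-up sample data: two incidents on consecutive days.
incidents = [
    Incident(datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 15, 0), datetime(2024, 1, 1, 15, 30)),
    Incident(datetime(2024, 1, 2, 9, 0), datetime(2024, 1, 2, 11, 0), datetime(2024, 1, 2, 12, 30)),
]

print("MTTK:", mean_time_to_know(incidents))  # MTTK: 4:00:00
print("Fix: ", mean_time_to_fix(incidents))   # Fix:  1:00:00
```

In this invented sample, diagnosis takes four times as long as the fix itself, which is the pattern the paragraph above describes.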

Because an IT organization is typically divided into independent silos that have little interaction with each other, the diagnosis process cannot be a collaborative effort. The war room where personnel gather to battle the problem becomes a war against each other. Instead of one collaborative effort, each silo uses its own specialized tools to evaluate the issue, and can typically only determine the fault lies with another group, but does not know which one. So the problem gets passed from group to group, a tedious and time-wasting exercise.

We will always have different, specialized groups within one IT organization to oversee areas such as end-user experience, application monitoring, database monitoring, transaction mapping and infrastructure monitoring. What must go is each group's reliance on an individual dashboard that monitors only its own domain. The key is to roll all of that reporting information, in real time, into one global dashboard that provides broad domain monitoring capabilities and can be abstracted and analyzed in a way that focuses on services and transactions. Providing this single source of truth will reconcile technology silos and support better incident and problem management processes.

In other words, you take the war out of the war room. Each participant can find the right information needed to perform his or her tasks while also sharing that information with peers so they can do the same.
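As a rough illustration of the single, transaction-focused view described above, the sketch below merges samples from three domain tools into one record per transaction. Everything here is an assumption made for the example: the feed format, the `txn` key, the metric names and the idea that all values are latencies in milliseconds. A real deployment would depend on what each monitoring tool actually exports.

```python
from collections import defaultdict

# Hypothetical samples, one feed per domain team's tool (all values in ms).
network_feed = [{"txn": "checkout-42", "metric": "network_rtt_ms", "value": 35}]
app_feed = [{"txn": "checkout-42", "metric": "app_server_ms", "value": 120}]
db_feed = [{"txn": "checkout-42", "metric": "db_query_ms", "value": 2400}]


def unified_view(*feeds):
    """Merge per-domain samples into one dict of metrics per transaction."""
    view = defaultdict(dict)
    for feed in feeds:
        for sample in feed:
            view[sample["txn"]][sample["metric"]] = sample["value"]
    return dict(view)


def slowest_component(view, txn):
    """Return the name of the metric contributing the most latency to a transaction."""
    metrics = view[txn]
    return max(metrics, key=metrics.get)


view = unified_view(network_feed, app_feed, db_feed)
print(view["checkout-42"])
print(slowest_component(view, "checkout-42"))  # db_query_ms
```

With every team looking at the same merged record, the question shifts from "whose fault is it?" to "which component of this transaction is slow?"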

Implementing this new approach to performance management will be a radical change for many organizations, and there may be initial resistance to overcome as groups worry their individual roles are at risk of marginalization. Again, the ultimate goal is not to eliminate specialized groups within one IT organization; it is to improve the collaboration among those groups. The result is performance management that is much less reactive and no longer has to wait for a problem to occur before taking action. Universal real-time monitoring can enable IT to anticipate when and where a problem may arise and fix it before the end user or customer even notices it. The most productive end users and happiest customers can often be the ones you never hear from because their experiences are always positive. That kind of silence is golden.

