I am a professional problem solver. When I started applying my skill set as an Industrial Engineer to the IT environment, I was astonished at how low the maturity level of troubleshooting skills, specifically Root Cause Analysis (RCA), was in this sector. I became aware of a whole generation of IT Professionals who had never been taught the fundamentals of deductive reasoning, problem solving, or critical thinking. There was (mis)use of the “5 Whys” technique and there were attempts to clone branded solutions to in-house situations, but little else in RCA. That has changed gradually over the last 10 years, but there are still some very basic problems and obstacles in getting RCA properly embedded into the Service Management environment.
As a company specializing in solution applications, we have been at the forefront of this evolution. We have detected six trends that will continue to shape RCA in 2017 and beyond, and we will continue this series of blogs by discussing each trend in more detail over the next few months. Here are the six most significant trends taking place within RCA methodologies as they apply to the IT sector:
- Understanding the real meaning of a Root Cause – Most IT Professionals do not make the distinction between the Technical Cause of a problem and its Root Cause. These terms are often used interchangeably or, in the worst case, no distinction is made between them at all. The IT Professional who “gets” this distinction has a major advantage over his/her colleagues in determining a Root Cause more quickly, more cheaply and permanently. The trend of recognizing this distinction will grow in the years to come.
- Aligning the “handover” practices between Incident Management (IM) and Problem Management (PM) – Last year I worked with a significant number of clients who were adamant about improving the handover between IM and PM in the following ways:
  - Having a seamless process between IM and PM.
  - Improving the quality of data and ensuring no data is lost during the handover.
  - Having templates that can be added to as the incident investigation moves from one area to the other.
  - Having a common incident investigation approach and language to eliminate any misunderstanding along the way.
- IT divisions are starting to adopt the rigor and discipline of a structured approach to incident investigations – The benefits of arriving at an effective and correct restoration plus an accurate root cause analysis are getting management’s attention. Too many outstanding incident tickets with slow restoration times are becoming just too expensive for IT Divisions to reason away. Many of our clients would like to see how they are doing and are starting to employ templates and techniques with the appropriate metrics in the analysis process. They want the incident analysis data to be visible, which would make it easier for them to identify areas of improvement.
- The importance of a robust Knowledge Management Practice – This area has been overlooked for far too long. Management is realizing that having the correct knowledge, skills and expertise available will make a significant difference in becoming more effective in incident investigations. In 2016, we experienced a major increase in requests to help our clients establish a worthwhile knowledge base covering the 20% of objects and fault types that account for 80% of their incidents. In many cases, this exercise alone produced a major reduction in downtime and in the number of incidents.
- C-Level managers are starting to understand and embrace the potential benefits of Problem Management – We are still struggling to see a significant change in attitude towards a full commitment to establishing a highly skilled Problem Management practice. Unless you are an expert in the many ways causes, effects and consequences interact with each other, it is difficult to understand how quickly the potential benefits multiply. In 2016, our senior consultants made it their mission to sit down with C-Level Managers and rationally explain these benefits (with graphics) to improve their understanding of the opportunities. We estimated that a good problem investigation with a verified root cause will not only eliminate the recurrence of that particular incident, but will also prevent at least six to eight other potential incidents.
- A major push towards standardizing incident investigation methodologies, processes, templates and techniques – Everyone knows that a common investigation approach with a common language makes a world of difference in how a company resolves its incidents. However, if you add to this common approach some simple-to-apply processes and templates with “worked questions”, additional benefits can easily be realized: cross-silo collaboration, confidence in the approach, building on each other’s ideas and a positive impact on attaining consensus.
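
The 80/20 exercise mentioned under the Knowledge Management trend can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a description of any particular tool: the function name `pareto_categories`, the `threshold` parameter and the ticket categories and counts are all made up for the example.

```python
from collections import Counter


def pareto_categories(incident_categories, threshold=0.8):
    """Return the most frequent categories, in descending order, until
    their combined count reaches `threshold` of all incidents.

    Hypothetical helper for illustration only.
    """
    counts = Counter(incident_categories)
    total = sum(counts.values())
    selected = []
    covered = 0
    # most_common() yields (category, count) from most to least frequent
    for category, count in counts.most_common():
        if covered / total >= threshold:
            break
        selected.append((category, count))
        covered += count
    return selected


# Hypothetical ticket log: one category label per incident (made-up data)
tickets = (
    ["password reset"] * 50
    + ["disk full"] * 30
    + ["network outage"] * 10
    + ["printer fault"] * 6
    + ["license expired"] * 4
)

for category, count in pareto_categories(tickets):
    print(f"{category}: {count}")
```

With this made-up data, two of the five categories already cover 80% of the tickets, so those are the ones worth documenting first in a knowledge base.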