When to Conduct a Root Cause Analysis in Your IT Environment

Written by Systems Engineering | September 15, 2017

This blog post reviews a systematic approach for tracing an issue back to its beginning when quality breaks down within your IT environment. This process is known as a Root Cause Analysis, or RCA, and is used to figure out how, where, and why an issue occurred. The best RCA will also answer the question of how to prevent the issue in the future.

Root Cause Analysis (RCA) is a method used to identify and document the potential causes of a problem. This should take place when an incident or breakdown in service occurs, particularly incidents or breakdowns that lead to undesired outcomes for clients.   

How do you know when an RCA is needed within your IT environment? Here are some criteria to determine when it's most likely necessary:

  • Complaint or feedback from client
  • Failure in service delivery
  • Unexpected downtime
  • Data loss
  • Undefined process
  • Financial or billing adjustment, often ending in a write-off
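
For teams that log incidents in a ticketing or tracking tool, the criteria above can be captured as a simple checklist. The following is a minimal Python sketch, not a prescribed implementation: the names `Trigger`, `Incident`, and `rca_likely_needed` are hypothetical, and the rule that any single criterion flags an incident for review is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Set


class Trigger(Enum):
    """The criteria listed above, one member per bullet."""
    CLIENT_COMPLAINT = "Complaint or feedback from client"
    SERVICE_DELIVERY_FAILURE = "Failure in service delivery"
    UNEXPECTED_DOWNTIME = "Unexpected downtime"
    DATA_LOSS = "Data loss"
    UNDEFINED_PROCESS = "Undefined process"
    BILLING_ADJUSTMENT = "Financial or billing adjustment"


@dataclass
class Incident:
    """Hypothetical incident record; the field names are assumptions."""
    summary: str
    triggers: Set[Trigger] = field(default_factory=set)


def rca_likely_needed(incident: Incident) -> bool:
    """Assumed rule: any one matching criterion flags the incident for an RCA."""
    return len(incident.triggers) > 0


# Example with a made-up incident.
outage = Incident("Mail server outage", {Trigger.UNEXPECTED_DOWNTIME})
print(rca_likely_needed(outage))
```

A check like this only flags candidates. Deciding the severity level and whether to proceed still rests with the people responsible for the RCA.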

Most problems do not have a single, clearly identifiable cause. A root cause analysis can help determine possible contributing factors, such as what happened, how it happened, and why it might have happened. The main objectives of walking through an RCA are:

  • Prevent recurrence of the issue
  • Continuously improve service quality
  • Document accountability for the breakdown
  • Identify deficiencies in process or process documentation
  • Identify training needs and opportunities
  • Establish a bar of excellence

Here's something very important to keep in mind:
RCA focuses on process issues, not people issues.

Who gets involved?

An RCA is done by a team of people who are stakeholders in the incident or breakdown: those who understand the problem that needs a solution. These individuals might also be the same people who implement the preventive actions aimed at eliminating root causes.

It is the responsibility of all employees to notify their manager or the Quality Management (QM) department when an incident occurs that meets the criteria above. It is the responsibility of the manager and the QM department to determine the severity level of the incident, initiate and conduct the RCA, monitor and assist with the preventive action plans put into place, and follow up on the quality audit plans.

Employees may be asked to contribute information to a timeline of events, take part in establishing the root cause or causes, and develop and implement preventive actions. During this process, it is important for the manager and the QM department to communicate what information is needed from employees and how employees are expected to document it.

The Process

Once an RCA has been identified as necessary, it's important to understand the high-level steps for conducting the analysis. They are as follows:

  • Identify the incident
  • Notify the Manager, Team Lead, or QM Department
  • Identify stakeholders
  • Have stakeholders establish a timeline of events
  • Identify breakdowns or deviations that led to the incident
  • Identify contributing factors to the breakdowns
  • Ask why for each contributing factor to establish the root cause
  • Brainstorm and determine preventive actions to eliminate the root cause and prevent a repeat occurrence
  • Establish a quality audit plan, to-dos, and follow-up
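
To illustrate the "ask why" step, here is a minimal Python sketch of how a team might record the chain of answers for one contributing factor. The class `ContributingFactor`, its methods, and the sample incident are all hypothetical, and treating the last recorded answer as the root cause is an assumption that holds only once the team agrees there is nothing further to ask.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContributingFactor:
    """One contributing factor and the answers given each time the team asks why."""
    description: str
    whys: List[str] = field(default_factory=list)

    def ask_why(self, answer: str) -> None:
        """Record the answer to the next 'why?' in the chain."""
        self.whys.append(answer)

    @property
    def root_cause(self) -> Optional[str]:
        """Assumption: the deepest answer recorded so far is the working root cause."""
        return self.whys[-1] if self.whys else None


# Made-up example: each answer becomes the subject of the next "why?".
factor = ContributingFactor("Nightly backup job did not run")
factor.ask_why("The backup volume was full")
factor.ask_why("No alert fired when disk usage crossed the threshold")
factor.ask_why("Disk monitoring is not part of the server setup checklist")

print(factor.root_cause)
```

Note that the chain in this example ends at a gap in a process (a missing checklist item) rather than at a person, which keeps the analysis focused on process issues.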

Focusing on the cause of the issue, as opposed to the resulting symptoms, can help establish more sustainable and reliable solutions to problems.

If you want to think differently about how to approach problems within your IT environment, perhaps a third-party discussion is what you need to help uncover the hidden issues behind the problems you are trying to solve.

Contact Systems Engineering at 888.624.6737, or fill in the form below and one of our knowledgeable representatives will be happy to reach out and discuss how our services can help remove your IT environment roadblocks.