Frameworx Application: Fault Correlation & Root Cause Analysis

Category: (2) TAM Application Type

Application Identifier: 7.10.2

Maturity Level: 4

Overview

Fault Correlation & Root Cause Analysis collects fault events from the network, together with other relevant information such as network topology, and relates these events to one another, reducing the volume of raw events to a smaller, manageable number. Root Cause Analysis (RCA) enables the end user to quickly determine the root cause of a problem in the network. These applications have a unique role in mediating network alarms with topology and configuration data.
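As an illustration of relating fault events to topology, the following is a minimal sketch and not part of the Frameworx model. It assumes a hypothetical dependency map (UPSTREAM) in which each element names the element it depends on, and it groups raw alarms under the highest alarmed element found on each upstream path. The element names and the functions probable_root and correlate are invented for this example, and the topology is assumed to be acyclic.

```python
from collections import defaultdict

# Hypothetical topology: each element maps to the element it depends on.
# None marks the top of the hierarchy. The map is assumed to be acyclic.
UPSTREAM = {
    "core-1": None,
    "agg-1": "core-1",
    "access-1": "agg-1",
    "access-2": "agg-1",
    "access-3": "core-1",
}

def probable_root(element, alarmed, upstream=UPSTREAM):
    """Walk upstream and return the highest alarmed element on the path.

    If no upstream element is alarmed, the element is its own root.
    """
    root = element
    node = upstream.get(element)
    while node is not None:
        if node in alarmed:
            root = node
        node = upstream.get(node)
    return root

def correlate(alarmed_elements, upstream=UPSTREAM):
    """Group raw alarms under their probable root element."""
    alarmed = set(alarmed_elements)
    groups = defaultdict(list)
    for element in alarmed_elements:
        groups[probable_root(element, alarmed, upstream)].append(element)
    return dict(groups)

# Four raw alarms reduce to two groups, each headed by a probable root.
raw = ["access-1", "agg-1", "access-2", "access-3"]
groups = correlate(raw)
```

In this sketch the alarms on access-1 and access-2 are treated as sympathetic to the alarm on agg-1, while access-3 stands alone because nothing upstream of it is alarmed. A real application would obtain the topology from autodiscovery or an inventory system rather than a hard-coded map.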

Functionality

Fault Correlation & Root Cause Analysis functionality includes the following:

  • Alarm Correlation (the ability to collect all relevant fault events along with other relevant information and reduce them to a smaller, manageable number). This can include:
    • Alarm de-duplication – first level of alarm reduction based on pre-defined user criteria. Alarm de-duplication is designed to eliminate repeated events to reduce the amount of “noise” from the network. The application should provide the end user with the capability to define rules for de-duplication.
    • Alarm auto-clearing – ability of the application to correlate a previous alarm with a clear-alarm received from the source (NE, NMS, or EMS). The application should deliver “out-of-the-box” auto-clearing capabilities for each device type/EMS/NMS supported, as well as capabilities for end users to define their own auto-clearing rules.
    • Alarm thresholding – ability of the application to handle various thresholding scenarios, such as alarm flapping and integration with performance management systems to receive threshold crossing alarms, as well as to generate synthetic threshold alarms based on pre-defined user conditions. The application should provide the end user with the ability to maintain “out-of-the-box” rules, as well as to develop their own rules for threshold management.
    • Correlating alarms with supporting data (topology, configuration), including
      • intra-element
      • inter-element (including up/down the various network layers)
      • service-based. For the application to perform topology-based correlation, it must be “topology aware”. Topology awareness can be achieved through autodiscovery or through integration with an inventory management application. Inter-element and service-based correlation can only be achieved if the inventory data is valid and available for integration with the correlation application.
      • Alarm enrichment (external database connectivity)
      • Ability to associate services to the physical aspects of the network.
      • Filter, summarize, and reduce displayed alarms
      • Consolidation of alarms
      • Consolidating alarms across technology
      • Consolidating alarms across elements
      • Present to alarm console
      • Graphical display of fault / topology overlay
      • Provide alarm to other systems
      • Store the alarms and root cause for extended periods
  • Root Cause Analysis (RCA) – ability to pinpoint the root cause of the problem or, in some instances, the probable cause of the problem. The application should provide:
    • Root cause isolation based on correlation analysis
    • Fault isolation
    • Network Element / network layer attribution
    • Alarm consolidation / substitution as well as alarm suppression of the sympathetic alarms.
    • Problem identification / initiation (ticket creation). Once the root cause or probable cause is determined, the application should have the ability to integrate with a trouble management application for manual or automated ticket creation.
    • Resolution initiation (testing, solution identification/ownership, knowledge base index). The application should have the capability to integrate with various testing applications. Integration with testing should be rules-based.
    • Knowledge of topology
    • Present to alarm console
    • Drill down from root cause into details
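
The de-duplication, auto-clearing, and flapping behaviours listed above can be sketched as follows. This is an illustrative example under stated assumptions, not a definitive implementation: the Event fields, the ActiveAlarms class, the use of the severity value "clear" to mark a clear-alarm, and the is_flapping helper are all hypothetical names chosen for this sketch. A real application would drive these behaviours from user-defined rules.

```python
from dataclasses import dataclass

@dataclass
class Event:
    source: str       # NE, EMS, or NMS identifier (hypothetical field)
    alarm_type: str   # e.g. "link-down"
    severity: str     # "clear" marks a clear-alarm in this sketch
    timestamp: float  # seconds

class ActiveAlarms:
    """Tracks active alarms keyed by (source, alarm_type)."""

    def __init__(self):
        self.active = {}  # key -> {"first": Event, "count": int}

    def process(self, event):
        key = (event.source, event.alarm_type)
        if event.severity == "clear":
            # Auto-clearing: match the clear-alarm to a previous alarm.
            return "cleared" if self.active.pop(key, None) else "ignored"
        if key in self.active:
            # De-duplication: count the repeat instead of raising it again.
            self.active[key]["count"] += 1
            return "duplicate"
        self.active[key] = {"first": event, "count": 1}
        return "new"

def is_flapping(raise_times, window_seconds, max_raises):
    """True if more than max_raises raises fall within one sliding window."""
    times = sorted(raise_times)
    start = 0
    for end, t in enumerate(times):
        while t - times[start] > window_seconds:
            start += 1
        if end - start + 1 > max_raises:
            return True
    return False

table = ActiveAlarms()
events = [
    Event("ne-1", "link-down", "major", 0.0),
    Event("ne-1", "link-down", "major", 5.0),   # repeat of the first event
    Event("ne-1", "link-down", "clear", 9.0),   # clears the active alarm
    Event("ne-1", "link-down", "clear", 12.0),  # nothing left to clear
]
results = [table.process(e) for e in events]
```

Here the key (source, alarm_type) stands in for a user-defined de-duplication rule, and is_flapping could feed a synthetic threshold alarm when an alarm is raised too often within the chosen window.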

Supported Business Services

(2) TAM Application Type Fault Correlation & Root Cause Analysis

Appears on these diagrams:

Frameworx Processes (Automated TAM to eTOM)

Frameworx Domains (Horizontal)


This was created from the Frameworx 16.0 Model


Created from the TM Forum Model Frameworx 16.0.0 on 6/13/2016 at 22:31