Troubleshooting of Infrastructure Alarms

This chapter provides a description, severity, and troubleshooting procedure for each commonly encountered Cisco NCS 1010 infrastructure alarm and condition. When an alarm is raised, refer to its clearing procedure.
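As a rough illustration of how the default severities listed in this chapter can be used to decide which active alarm to look at first, the following Python sketch sorts alarm names by severity. The table holds only a few entries copied from this chapter. The helper name `triage_order` and the ranking (Critical before Major before Minor) are assumptions made for illustration and are not part of any Cisco tool.

```python
# Default severities copied from a few entries in this chapter.
# The ranking and the helper below are illustrative assumptions.
DEFAULT_SEVERITY = {
    "ESD_INIT_ERR_E": "CR",
    "SWITCH_DMA_ERR_E": "CR",
    "FAN FAIL": "MJ",
    "FPD IN NEED UPGD": "MJ",
    "SIA_GRACE_PERIOD_REMAINING": "MN",
}

# Lower rank means "look at this first" (assumed ordering).
SEVERITY_RANK = {"CR": 0, "MJ": 1, "MN": 2}


def triage_order(active_alarms):
    """Return the known alarms sorted most severe first, then by name."""
    known = [name for name in active_alarms if name in DEFAULT_SEVERITY]
    return sorted(known, key=lambda name: (SEVERITY_RANK[DEFAULT_SEVERITY[name]], name))
```

For example, `triage_order(["FAN FAIL", "SIA_GRACE_PERIOD_REMAINING", "ESD_INIT_ERR_E"])` places ESD_INIT_ERR_E first and SIA_GRACE_PERIOD_REMAINING last. Alarm names that are not in the table are ignored.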

Disaster Recovery ISO Image Corruption

Default Severity: Major (MJ), Non-Service-Affecting (NSA)

Logical Object: Instorch

The Disaster Recovery ISO Image Corruption alarm is raised when the ISO image in the CPU or the motherboard disks is corrupted.

Clear the Disaster Recovery ISO Image Corruption Alarm

Procedure


This alarm is cleared when the ISO image is restored.

The system automatically downloads the image from the local repository, and the alarm clears within 12 hours.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


ESD_INIT_ERR_E

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Object: ESD

The ESD_INIT_ERR_E alarm is raised when the Ethernet Switch Driver (ESD) initialization fails.

Clear the ESD_INIT_ERR_E Alarm

Procedure


Cisco IOS XR automatically detects this alarm and clears it by resetting the switch.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


FAN FAIL

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Object: SPI-ENVMON

The FAN FAIL alarm is raised when one of the two fans fails. When a fan fails, the temperature rises above its normal operating range. This condition can trigger the TEMPERATURE alarm.

Clear the FAN FAIL Alarm

Procedure


Verify that the fan is correctly inserted. The fan should start running immediately when it is correctly inserted.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


FPD IN NEED UPGD

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Object: SPI-FPD

The FPD IN NEED UPGD alarm is raised when the Field Programmable Device (FPD) image is not aligned with the available package version.

Clear the FPD IN NEED UPGD Alarm

Procedure


This alarm is cleared when the respective FPD is upgraded with the "upgrade hw-module location 0/x fpd y" command.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).
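When several FPDs need the same treatment, it can help to generate the command text instead of typing it repeatedly. The sketch below only fills in the "upgrade hw-module location 0/x fpd y" template quoted in the procedure above. The function name is hypothetical, and the location and FPD name in the usage note are placeholders, not values taken from a real device.

```python
def fpd_upgrade_command(location: str, fpd: str) -> str:
    """Build the upgrade command text for one FPD at one location.

    Follows the template from the procedure:
    upgrade hw-module location 0/x fpd y
    """
    location = location.strip()
    fpd = fpd.strip()
    if not location or not fpd:
        raise ValueError("location and fpd must be non-empty")
    return f"upgrade hw-module location {location} fpd {fpd}"
```

With the placeholder values `"0/0"` and `"ExampleFPD"`, the function returns `upgrade hw-module location 0/0 fpd ExampleFPD`. Use the location and FPD name that the device actually reports.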


FAN-POWER-ERROR

Default Severity: Major (MJ), Non-Service-Affecting (NSA)

Logical Object: SPI-ENVMON

The FAN-POWER-ERROR alarm is raised when power to the fan tray fails.

Clear the FAN-POWER-ERROR Alarm

Procedure


This alarm is cleared when power to the fan tray is restored or when the Online Insertion and Removal (OIR) of the fan tray is completed.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


FAN SPEED SENSOR 0: OUT OF TOLERANCE FAULT

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Object: SPI-ENVMON

The FAN SPEED SENSOR 0: OUT OF TOLERANCE FAULT alarm is raised when one or more fans in the fan tray are faulty.

Clear the FAN SPEED SENSOR 0: OUT OF TOLERANCE FAULT Alarm

Procedure


This alarm is cleared when the fans in the chassis are replaced.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


FAN-TRAY-REMOVAL

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Object: SPI-ENVMON

The FAN-TRAY-REMOVAL alarm is raised when all the fan trays are removed from the chassis.

Clear the FAN-TRAY-REMOVAL Alarm

Procedure


This alarm is cleared when the fan trays are inserted into the chassis.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


INSTALL IN PROGRESS

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Object: SPI-INSTALL

The INSTALL IN PROGRESS alarm is raised when the install operation is in progress or if the "install commit" is not performed after activating a new image or package.

Clear the INSTALL IN PROGRESS Alarm

Procedure


Step 1

Wait until the install operation is complete.

Step 2

Perform the "install commit" operation after the "install activate" operation.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).
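The raise condition for INSTALL IN PROGRESS combines two cases: an operation that is still running, and an activation that has not been committed. The following Python sketch restates that logic as a small predicate. The function and parameter names are assumptions for illustration and do not correspond to any IOS XR interface.

```python
def install_in_progress_alarm(operation_running: bool, activated: bool, committed: bool) -> bool:
    """Return True when the alarm condition described in this section holds.

    The alarm is expected while an install operation is running, or after
    "install activate" until "install commit" has been performed.
    """
    if operation_running:
        return True
    return activated and not committed
```

Under this model, an activated but uncommitted image keeps the alarm raised even when no operation is running, which is why the procedure calls for "install commit" after "install activate".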


PORT_AUTO_TUNE_ERR_E

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Object: ESD

The PORT_AUTO_TUNE_ERR_E alarm is raised when the port auto-tuning fails.

Clear the PORT_AUTO_TUNE_ERR_E Alarm

Procedure


This alarm cannot be cleared unless another fault that causes a port reset, such as an unstable link state, is detected on the link.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


PID-MISMATCH

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Object: SPI-ENVMON

The PID-MISMATCH alarm is raised when one AC and one DC PSU are connected.

Clear the PID-MISMATCH Alarm

Procedure


This alarm is cleared when both PSUs have the same PID.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).
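The PID-MISMATCH condition amounts to checking whether the installed PSUs are all of one type. A minimal Python sketch of that check follows. The function name and the use of the strings "AC" and "DC" to represent PSU types are assumptions for illustration.

```python
def pid_mismatch(psu_types):
    """Return True when the installed PSUs are not all of the same type.

    psu_types is a list such as ["AC", "DC"]; one AC and one DC PSU
    is the mismatch case described in this section.
    """
    return len(set(psu_types)) > 1
```

For example, `pid_mismatch(["AC", "DC"])` is true, while `pid_mismatch(["AC", "AC"])` and `pid_mismatch(["DC", "DC"])` are false.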


POWER MODULE OUTPUT DISABLED

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Object: SPI-ENVMON

The POWER MODULE OUTPUT DISABLED alarm is raised when the power supply is disabled on the active Power Entry Module (PEM).

Clear the POWER MODULE OUTPUT DISABLED Alarm

Procedure


This alarm is cleared when the user enables the power supply.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


POWER-MODULE-REDUNDANCY-LOST

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Object: SPI-ENVMON

The POWER-MODULE-REDUNDANCY-LOST alarm is raised under one of the following conditions:

• If the power supply to the Power Supply Unit (PSU) is removed.

• If the PEM is removed.

Clear the POWER-MODULE-REDUNDANCY-LOST Alarm

Procedure


This alarm is cleared when the user re-inserts the power supply or connects the power cable again.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


PORT_INIT_ERR_E

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Object: ESD

The PORT_INIT_ERR_E alarm is raised when the port initialization fails.

Clear the PORT_INIT_ERR_E Alarm

Procedure


Cisco IOS XR automatically detects this alarm and clears it by resetting the port.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


SPI_FLASH_CFG_INIT_ERR_E

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Object: ESD

The SPI_FLASH_CFG_INIT_ERR_E alarm is raised when there is an unexpected or unsupported switch firmware version present.

Clear the SPI_FLASH_CFG_INIT_ERR_E Alarm

Procedure


The ESD process automatically recovers from this alarm by resetting the Aldrin and restarting itself. If the alarm persists, reload the 0/Rack.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


SWITCH_ALL_PORTS_DOWN_ERR_E

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Object: ESD

The SWITCH_ALL_PORTS_DOWN_ERR_E alarm is raised when all the monitored switch ports are down.

Clear the SWITCH_ALL_PORTS_DOWN_ERR_E Alarm

Procedure


Cisco IOS XR automatically detects this alarm and clears it by resetting the port.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


SWITCH_CFG_INIT_ERR_E

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Object: ESD

The SWITCH_CFG_INIT_ERR_E alarm is raised when the switch configuration fails.

Clear the SWITCH_CFG_INIT_ERR_E Alarm

Procedure


Cisco IOS XR automatically detects this alarm and clears it by resetting the switch.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


SWITCH_CRITICAL_PORT_FAILED_E

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Object: ESD

The SWITCH_CRITICAL_PORT_FAILED_E alarm is raised when a critical port fails.

Clear the SWITCH_CRITICAL_PORT_FAILED_E Alarm

Procedure


The ESD process automatically recovers from this alarm by resetting the Aldrin and restarting itself. If the alarm persists, reload the 0/Rack.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


SWITCH_DMA_ERR_E

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Object: ESD

The SWITCH_DMA_ERR_E alarm is raised when the switch Direct Memory Access (DMA) engine fails.

Clear the SWITCH_DMA_ERR_E Alarm

Procedure


Cisco IOS XR automatically detects this alarm and clears it by resetting the switch.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


SWITCH_EEPROM_INIT_ERR_E

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Object: ESD

The SWITCH_EEPROM_INIT_ERR_E alarm is raised when the Switch EEPROM initialization fails.

Clear the SWITCH_EEPROM_INIT_ERR_E Alarm

Procedure


Cisco IOS XR automatically detects this alarm and clears it by resetting the switch.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


SWITCH_FDB_ERR_E

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Object: ESD

The SWITCH_FDB_ERR_E alarm is raised when a switch Forwarding Database (FDB) operation fails.

Clear the SWITCH_FDB_ERR_E Alarm

Procedure


Cisco IOS XR automatically detects this alarm and clears it by resetting the switch.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


SWITCH_FDB_MAC_ADD_ERR_E

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Object: ESD

The SWITCH_FDB_MAC_ADD_ERR_E alarm is raised when the switch firmware is unable to add a MAC address to its database.

Clear the SWITCH_FDB_MAC_ADD_ERR_E Alarm

Procedure


This alarm cannot be cleared by manual recovery.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


SWITCH_FIRMWARE_BOOT_FAIL_E

Default Severity: Critical (CR), Non-Service-Affecting (NSA)

Logical Object: ESD

The SWITCH_FIRMWARE_BOOT_FAIL_E alarm is raised when the switch firmware boot fails.

Clear the SWITCH_FIRMWARE_BOOT_FAIL_E Alarm

Procedure


The ESD automatically clears this alarm by resetting the switch.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


SWITCH_NOT_DISCOVERED_E

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Object: ESD

The SWITCH_NOT_DISCOVERED_E alarm is raised when the switch is not discovered on the Peripheral Component Interconnect Express (PCIe) bus.

Clear the SWITCH_NOT_DISCOVERED_E Alarm

Procedure


The ESD automatically clears this alarm by resetting the switch.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


SWITCH_RESET_RECOVERY_FAILED_E

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Object: ESD

The SWITCH_RESET_RECOVERY_FAILED_E alarm is raised when the switch reset operation does not recover the switch.

Clear the SWITCH_RESET_RECOVERY_FAILED_E Alarm

Procedure


The ESD automatically clears this alarm by reloading the card.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


TEMPERATURE

Default Severity: Minor (MN), Major (MJ), Critical (CR), Non-Service-Affecting (NSA)

Logical Object: SPI-ENVMON

The TEMPERATURE alarm is raised when the temperature is out of the operating range.

Clear the TEMPERATURE Alarm

Procedure


This alarm is cleared when the temperature falls within the operating range.

Ensure that there are no airflow obstructions, that the fans are working properly, and that the ambient temperature is below 30 degrees.

Use "show environment" to check fan speed and temperature values. Use "show alarms brief system active" to check for any alarms on the fan trays.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).
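The three checks in the TEMPERATURE procedure can be recorded as a simple checklist. The Python sketch below takes the results of those checks as inputs and returns the follow-up actions that remain. It does not read anything from the device. The function name is hypothetical, and treating the 30-degree limit as degrees Celsius is an assumption, because the procedure does not state the unit.

```python
# "below 30 degrees" comes from the procedure; the unit (Celsius) is assumed.
AMBIENT_LIMIT = 30.0


def temperature_findings(airflow_clear: bool, fans_ok: bool, ambient: float) -> list[str]:
    """Return the follow-up actions suggested by the three procedure checks."""
    findings = []
    if not airflow_clear:
        findings.append("remove airflow obstructions")
    if not fans_ok:
        findings.append("check fans and fan tray alarms")
    if ambient >= AMBIENT_LIMIT:
        findings.append("reduce ambient temperature below 30 degrees")
    return findings
```

An empty list means that all three checks passed. If the alarm is still active in that case, continue with the support escalation described in this section.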


UNSTABLE_LINK_E

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Object: ESD

The UNSTABLE_LINK_E alarm is raised when a link is unstable, with a high number of UP and DOWN state changes.

Clear the UNSTABLE_LINK_E Alarm

Procedure


The ESD automatically clears this alarm by resetting the port.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


VOLTAGE

Default Severity: Minor (MN), Major (MJ), Critical (CR), Non-Service-Affecting (NSA)

Logical Object: SPI-ENVMON

The VOLTAGE alarm is raised when the voltage is out of the operating range.

Clear the VOLTAGE Alarm

Procedure


This alarm is cleared when the voltage falls within the operating range.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


OPTICAL-MOD-ABSENT

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Object: Phy1_mgmt

The OPTICAL-MOD-ABSENT alarm is raised when the line card is not present in the chassis.

Clear the OPTICAL-MOD-ABSENT Alarm

Procedure


This alarm clears when the user reinserts the line card and connects the fan.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).

OUT_OF_COMPLIANCE

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Object: plat_sl_client

The OUT_OF_COMPLIANCE alarm is raised when one or more entitlements are out of compliance.

Clear the OUT_OF_COMPLIANCE Alarm

Procedure


This alarm clears when the required number of additional licenses is purchased.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


COMM_FAIL

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Object: plat_sl_client

The COMM_FAIL alarm is raised when communication with the Cisco licensing cloud fails.

Clear the COMM_FAIL Alarm

Procedure


This alarm is cleared when the license server is reachable from the network element.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


SIA_GRACE_PERIOD_REMAINING

Default Severity: Minor (MN), Non-Service-Affecting (NSA)

Logical Object: plat_sl_client

The SIA_GRACE_PERIOD_REMAINING alarm is raised when a software upgrade is still allowed because the Software Innovation Access (SIA) grace period has not yet expired.

Clear the SIA_GRACE_PERIOD_REMAINING Alarm

Procedure


This alarm is cleared when SIA licenses are purchased.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


UPGRADE_LICENSE_UPGRADE_BLOCKED

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Object: plat_sl_client

The UPGRADE_LICENSE_UPGRADE_BLOCKED alarm is raised when software upgrades are blocked because the upgrade license grace period has expired.

Clear the UPGRADE_LICENSE_UPGRADE_BLOCKED Alarm

Procedure


This alarm is cleared when the required SIA licenses are purchased.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


UPGRADE_LICENSE_GRACE_PERIOD_REMAINING

Default Severity: Minor (MN), Non-Service-Affecting (NSA)

Logical Object: plat_sl_client

The UPGRADE_LICENSE_GRACE_PERIOD_REMAINING alarm is raised when a software upgrade is still allowed because the upgrade license grace period has not yet expired.

Clear the UPGRADE_LICENSE_GRACE_PERIOD_REMAINING Alarm

Procedure


This alarm is cleared when SIA licenses are purchased.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


SIA_UPGRADE_BLOCKED

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Object: plat_sl_client

The SIA_UPGRADE_BLOCKED alarm is raised when software upgrades are blocked because the SIA grace period has expired.

Clear the SIA_UPGRADE_BLOCKED Alarm

Procedure


This alarm is cleared when the SIA licenses are purchased.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


DISASTER_RECOVERY_UNAVAILABLE_ALARM

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Object: Instorch

The DISASTER_RECOVERY_UNAVAILABLE_ALARM is raised when disaster recovery boot is unavailable because of chassis SSD corruption.

Clear the DISASTER_RECOVERY_UNAVAILABLE_ALARM

Procedure


This alarm clears automatically after the upgrade.

If the alarm does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).