
Quidway S9300 Terabit Routing Switch

V100R001

Emergency Maintenance

Issue: 01

Date: 2009-04-15

Huawei Proprietary and Confidential


Copyright Huawei Technologies Co., Ltd.

Huawei Technologies Co., Ltd. provides customers with comprehensive technical support and service. For any
assistance, please contact our local office or company headquarters.

Huawei Technologies Co., Ltd.


Address: Huawei Industrial Base, Bantian, Longgang, Shenzhen 518129, People's Republic of China

Website: http://www.huawei.com

Email: support@huawei.com

Copyright Huawei Technologies Co., Ltd. 2009. All rights reserved.


No part of this document may be reproduced or transmitted in any form or by any means without prior written
consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions


and other Huawei trademarks are the property of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective holders.

Notice
The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but the statements, information, and
recommendations in this document do not constitute a warranty of any kind, express or implied.

Contents
About This Document

1 Overview of Emergency Maintenance
1.1 Definition of Emergency Maintenance
1.2 Definition of Emergencies
1.3 Initiation of Emergency Maintenance
1.4 Guidelines for Emergency Maintenance
1.5 Flow for Emergency Maintenance
1.5.1 Notifying Huawei of the Emergency
1.5.2 Locating the Fault
1.5.3 Collecting Fault Information
1.5.4 Rectifying the Fault
1.5.5 Obtaining Help
1.5.6 Checking the Handling Result
1.5.7 Recording Information About Emergency Maintenance
1.6 Emergency Maintenance Precautions
1.7 Technical Support

2 Emergency Maintenance for Device Faults
2.1 Overview
2.2 Flow for Handling Device Faults
2.3 Directions for Emergency Maintenance
2.3.1 Failure to Log In to a System Through the Console Interface
2.3.2 Failure to Start a System
2.3.3 Abnormality of the Board Status
2.3.4 Abnormality of the Interface Status

3 Emergency Maintenance for Service Faults
3.1 Overview
3.2 Flow for Handling Service Faults
3.3 Guide to Emergency Maintenance
3.3.1 Failure to Forward IP Unicast Packets
3.3.2 Failure to Forward IP Multicast Packets
3.3.3 Failure to Forward MPLS VPN Packets

4 Guide to Fault Information Collection
4.1 Overview
4.2 Collection of Basic Fault Information
4.3 Collection of Device Fault Information

5 Guide to System Reboot
5.1 Overview
5.2 Preparation for System Reboot
5.3 Guide to System Reboot
5.3.1 Running Command Lines
5.3.2 Pressing the RESET Button on the MCU/SRU
5.3.3 Switching Off and Switching On the System
5.3.4 Operating Through the NMS
5.4 Verification of System Reboot
5.4.1 Displaying Information About System Reboot
5.4.2 Checking the Software Version and Configuration File
5.5 Handling of a System Reboot Failure

6 Emergency Maintenance Record Table
6.1 Notice of Emergency Maintenance
6.2 Emergency Record Table

7 System Upgrading Through BIOS

Figures
Figure 1-1 Flowchart of emergency maintenance
Figure 1-2 Flowchart for identifying the type of a fault
Figure 2-1 Flowchart for handling device faults
Figure 2-2 Flowchart for handling the failed login to a system through the console interface
Figure 2-3 Flowchart for handling the failed system start
Figure 2-4 Flowchart for handling the abnormality of the board status
Figure 2-5 Flowchart for handling the abnormality of the interface status
Figure 3-1 Flowchart for handling service faults
Figure 3-2 Flowchart for handling the failure to forward IP unicast packets
Figure 3-3 Flowchart for handling the failure to forward IP multicast packets
Figure 3-4 Flowchart for handling the failure to forward MPLS VPN packets

Tables
Table 1-1 Methods of identifying the fault type
Table 2-1 Collection of information about the failure to log in to a system through the console interface
Table 2-2 Collection of information about the failure to start a system
Table 2-3 Collection of information about the abnormality of the board status
Table 2-4 Collection of information about the abnormality of the interface status
Table 3-1 Collection of information about the failure to forward IP unicast packets
Table 3-2 Collection of information about the failure to forward IP multicast packets
Table 3-3 Collection of information about the failure to forward MPLS VPN packets
Table 4-1 Collection of basic fault information
Table 4-2 Collection of device fault information
Table 6-1 Notice of emergency maintenance

About This Document


Purpose
This document describes how to rectify the device faults and service faults of the Quidway
S9300 Terabit Routing Switch. It also provides instructions for fault information collection and
device reboot.

Related Versions
The following table lists the product versions related to this document.
Product Name    Version
S9300           V100R001

Intended Audience
This document is intended for:
• Policy planning engineers
• Installation and commissioning engineers
• NM configuration engineers
• Technical support engineers

Organization
This document is organized as follows.

Chapter                                     Description
1 Overview of Emergency Maintenance         Provides the definitions, causes, guidelines, flowcharts, and precautions of emergency maintenance.
2 Emergency Maintenance for Device Faults   Describes the emergency maintenance for device faults, focusing on fault clearance and service recovery rather than fault rectification.
3 Emergency Maintenance for Service Faults  Describes the emergency maintenance for service faults, focusing on fault clearance and service recovery rather than fault rectification.
4 Guide to Fault Information Collection     Describes how to collect and back up fault information promptly after an emergency fault occurs.
5 Guide to System Reboot                    Describes how to restart the device manually when services are interrupted because of a device fault and the device cannot restart automatically.
6 Emergency Maintenance Record Table        Describes the tables that you need to fill in when performing emergency maintenance.
7 System Upgrading Through BIOS             Describes how to upgrade software through BIOS when the host software program fails to start.

Conventions
Symbol Conventions
The symbols that may be found in this document are defined as follows.

Symbol    Description
DANGER    Indicates a hazard with a high level of risk, which if not avoided, will result in death or serious injury.
WARNING   Indicates a hazard with a medium or low level of risk, which if not avoided, could result in minor or moderate injury.
CAUTION   Indicates a potentially hazardous situation, which if not avoided, could result in equipment damage, data loss, performance degradation, or unexpected results.
TIP       Indicates a tip that may help you solve a problem or save time.
NOTE      Provides additional information to emphasize or supplement important points of the main text.

General Conventions
The general conventions that may be found in this document are defined as follows.

Convention        Description
Times New Roman   Normal paragraphs are in Times New Roman.
Boldface          Names of files, directories, folders, and users are in boldface. For example, log in as user root.
Italic            Book titles are in italics.
Courier New       Examples of information displayed on the screen are in Courier New.

Command Conventions
The command conventions that may be found in this document are defined as follows.

Convention        Description
Boldface          The keywords of a command line are in boldface.
Italic            Command arguments are in italics.
[ ]               Items (keywords or arguments) in brackets [ ] are optional.
{ x | y | ... }   Optional items are grouped in braces and separated by vertical bars. One item is selected.
[ x | y | ... ]   Optional items are grouped in brackets and separated by vertical bars. One item is selected or no item is selected.
{ x | y | ... }*  Optional items are grouped in braces and separated by vertical bars. A minimum of one item or a maximum of all items can be selected.
[ x | y | ... ]*  Optional items are grouped in brackets and separated by vertical bars. Several items or no item can be selected.
&<1-n>            The parameter before the & sign can be repeated 1 to n times.
#                 A line starting with the # sign is a comment.

GUI Conventions
The GUI conventions that may be found in this document are defined as follows.

Convention  Description
Boldface    Buttons, menus, parameters, tabs, window, and dialog titles are in boldface. For example, click OK.
>           Multi-level menus are in boldface and separated by the ">" signs. For example, choose File > Create > Folder.

Keyboard Operations
The keyboard operations that may be found in this document are defined as follows.

Format        Description
Key           Press the key. For example, press Enter and press Tab.
Key 1+Key 2   Press the keys concurrently. For example, pressing Ctrl+Alt+A means the three keys should be pressed concurrently.
Key 1, Key 2  Press the keys in turn. For example, pressing Alt, A means the two keys should be pressed in turn.

Mouse Operations
The mouse operations that may be found in this document are defined as follows.

Action        Description
Click         Press and release the primary mouse button without moving the pointer.
Double-click  Press the primary mouse button twice in quick succession without moving the pointer.
Drag          Press and hold the primary mouse button and move the pointer to a certain position.

Update History
Updates between document issues are cumulative. Therefore, the latest document issue contains
all updates made in previous issues.

Updates in Issue 01 (2009-04-15)


Initial commercial release.

1 Overview of Emergency Maintenance


About This Chapter


This chapter defines emergency maintenance and emergencies, and describes the guidelines, flow, and precautions for emergency maintenance.
1.1 Definition of Emergency Maintenance
This section describes the definition and functions of emergency maintenance.
1.2 Definition of Emergencies
This section describes the definition and category of emergencies.
1.3 Initiation of Emergency Maintenance
This section describes when to initiate emergency maintenance.
1.4 Guidelines for Emergency Maintenance
This section describes the guidelines for emergency maintenance.
1.5 Flow for Emergency Maintenance
This section describes the flowchart of emergency maintenance.
1.6 Emergency Maintenance Precautions
This section describes the precautions for emergency maintenance.
1.7 Technical Support
This section describes how to seek Huawei technical support.


1.1 Definition of Emergency Maintenance


This section describes the definition and functions of emergency maintenance.
Emergency maintenance refers to rectifying an unexpected, urgent fault (such as a power-down or a service interruption) of a system or a device so that it resumes normal operation and losses are minimized.
Emergency maintenance measures also guide maintenance personnel in taking preventive measures to protect the system before a surge in traffic.
This document describes how to perform emergency maintenance for the Quidway S9300 Terabit Routing Switch.
The Huawei Quidway S9300 Terabit Routing Switch (hereafter referred to as the Terabit Routing Switch) is deployed at the access layer, convergence layer, and transport layer of Metro Ethernet networks. The Terabit Routing Switch series includes the S9312, S9306, and S9303.

1.2 Definition of Emergencies


This section describes the definition and category of emergencies.
An emergency is a fault that occurs unexpectedly, involves a wide range of devices or services, and affects network operation and service quality. For the S9300, emergencies fall into the following categories:
• Abnormal system: All services are interrupted.
• Abnormal Switch Routing Unit (SRU) or Main Control Unit (MCU): All services are interrupted.
• Abnormal service card: Some services are interrupted.
• Abnormal service module: Some services are interrupted.
• Abnormal network: Network services are interrupted.

Generally, alarms and logs about an abnormality are generated before an emergency arises. You can determine whether an emergency has occurred by checking alarms and logs or by reviewing customer complaints.
NOTE

The roadmap of emergency maintenance described in this chapter applies to emergencies. For common troubleshooting, refer to the Quidway S9300 Terabit Routing Switch Troubleshooting.

1.3 Initiation of Emergency Maintenance


This section describes when to initiate emergency maintenance.
The causes of an emergency include a software or hardware failure, an incorrect setting, an improper maintenance operation, a line failure, or a natural disaster. Initiate emergency maintenance in any of the following situations:
• Customer complaints
  Customer complaints are a main trigger for emergency maintenance. When a fault reported by a customer or the Customer Service Center meets the conditions in Definition of Emergencies, initiate emergency maintenance.
• Alarm indication
  When you check the alarms output by the Network Management System (NMS) or displayed on the terminal, initiate emergency maintenance if the alarms indicate a possible wide-ranging service failure.
• Natural disaster
  When a natural disaster such as an earthquake, a fire, or a flood occurs, devices must be powered off temporarily to protect them from damage, so emergency maintenance must be initiated. Power on the devices again after the disaster.

1.4 Guidelines for Emergency Maintenance


This section describes the guidelines for emergency maintenance.
Emergencies can easily cause network access failures for large numbers of users, device breakdowns, and service interruptions, resulting in serious losses. To handle an emergency efficiently and to minimize losses, comply with the following basic guidelines before maintaining the S9300:
• To keep devices running stably and minimize the probability of emergencies, refer to the Quidway S9300 Terabit Routing Switch Routine Maintenance.
• The core function of emergency maintenance is to recover system operation and service provisioning as soon as possible. To respond to an emergency, you must have plans ready for handling various emergencies according to the emergency maintenance manual. Managers and maintenance personnel must be familiar with the plans and well trained.
• Emergency maintenance training is mandatory for maintenance personnel. They must learn the basic methods of identifying emergencies and how to handle them.
• When an emergency occurs, keep calm and check whether the hardware devices and the routing are working normally. Then check whether the emergency is caused by an S9300. If it is, handle the fault according to the prepared plans or the procedures in this manual.
• The CF card contains important data. When an emergency occurs, do not format the CF card before consulting Huawei engineers.
• Contact the Customer Service Center or the local Huawei office early for technical support during troubleshooting.
• After handling an emergency, collect the alarm information related to the fault and send the fault handling report, device alarm files, and log files to Huawei for analysis. This helps Huawei improve its after-sales service.

1.5 Flow for Emergency Maintenance


This section describes the flowchart of emergency maintenance.


NOTE

You must maintain detailed records of operations and results for further reference by Huawei engineers
during troubleshooting so that they can handle a fault quickly.

When a fault persists, contact Huawei Customer Service Center. For contact information, see Technical
Support.

The main purpose of emergency maintenance is to recover a system as soon as possible. Figure
1-1 shows the flowchart of emergency maintenance.
Figure 1-1 Flowchart of emergency maintenance
Start
-> Notify Huawei of the emergency
-> Locate the fault
-> Collect fault information
-> Rectify the fault
-> Services recovered?
     No:  Obtain help
     Yes: Check the handling result
-> Record information about emergency maintenance
-> End
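
The flow in Figure 1-1 can be read as a short loop. The following Python sketch is illustrative only and is not part of any Huawei tool: the function name, the callables passed to it, and the max_rounds limit are all hypothetical. The extracted figure does not show where the arrow leaving "Obtain help" leads, so the sketch assumes that rectification is retried after help is obtained.

```python
from typing import Callable, List


def run_emergency_flow(
    rectify: Callable[[], None],
    service_recovered: Callable[[], bool],
    obtain_help: Callable[[], None],
    max_rounds: int = 3,
) -> List[str]:
    """Walk through the steps of Figure 1-1 and return the steps taken, in order.

    All callables are placeholders supplied by the caller (hypothetical).
    max_rounds is an assumption of this sketch so that the loop always ends;
    the flowchart itself sets no limit.
    """
    steps = ["notify Huawei", "locate fault", "collect fault information"]
    for _ in range(max_rounds):
        rectify()
        steps.append("rectify fault")
        if service_recovered():
            steps.append("check handling result")
            break
        # Services are still down: ask for help, then try again (assumed).
        obtain_help()
        steps.append("obtain help")
    steps.append("record emergency maintenance information")
    return steps
```

Returning the list of steps makes it easy to paste the sequence into the emergency maintenance record afterwards.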

1.5.1 Notifying Huawei of the Emergency


1.5.2 Locating the Fault
1.5.3 Collecting Fault Information
1.5.4 Rectifying the Fault
1.5.5 Obtaining Help
1.5.6 Checking the Handling Result
1.5.7 Recording Information About Emergency Maintenance

1.5.1 Notifying Huawei of the Emergency


When an emergency occurs, contact Huawei immediately for technical support.
NOTE

Even if you can complete emergency maintenance independently with the guidance of this manual, notify Huawei of the emergency. Huawei technical personnel then keep records of the fault to improve after-sales services.

1.5.2 Locating the Fault


When an emergency occurs, identify the nature of the fault by referring to customer complaints and alarm information. An emergency can be any of the following types:
• Abnormal system: All services are interrupted.
• Abnormal Switch Routing Unit (SRU) or Main Control Unit (MCU): All services are interrupted.
• Abnormal service card: Some services are interrupted.
• Abnormal service module: Some services are interrupted.
• Abnormal network: Network services are interrupted.

1.5.3 Collecting Fault Information


When an emergency occurs, promptly collect and back up information about the fault, and provide it to Huawei engineers when seeking technical support.
For details about fault information collection, see Guide to Fault Information Collection.

Recording Basic Fault Information


Collect the following basic information:
• Specific time when the fault occurs
• Detailed description of the fault
• Software version of the S9300
• Measures taken after the fault and the results
• Severity level of the problem and expected time of system recovery

Backing Up Device Fault Information


Back up the following information:
• Indicator status of the boards, power modules, and fans
• Device alarms
• Device logs
• Device configuration
• Device debugging information, if debugging is enabled
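
The basic fault information listed in this section can be kept in a small structured record so that nothing is forgotten before you contact Huawei. The following Python sketch is one possible layout; the class name and field names are hypothetical and are not defined by this manual.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class FaultRecord:
    """Basic fault information; field names are illustrative only."""

    occurred_at: str                 # specific time when the fault occurred
    description: str                 # detailed description of the fault
    software_version: str            # software version of the S9300
    measures_and_results: List[str] = field(default_factory=list)
    severity: str = ""               # severity level of the problem
    expected_recovery: str = ""      # expected time of system recovery

    def missing_fields(self) -> List[str]:
        """Return the names of the items that are still empty."""
        return [name for name, value in asdict(self).items() if not value]
```

Calling missing_fields() before handing the record over shows which items still need to be filled in.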


1.5.4 Rectifying the Fault


Determine whether the fault is a hardware fault, such as a device breakdown, or a software fault, such as a service failure, by following the flowchart shown in Figure 1-2 and the identification methods listed in Table 1-1.
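Figure 1-2 Flowchart for identifying the type of a fault
Start
-> Can log in through the console interface?
     No:  A device fault occurs
     Yes: Go to the next check
-> System starts normally?
     No:  A device fault occurs
     Yes: Go to the next check
-> Board status is normal?
     No:  A device fault occurs
     Yes: Go to the next check
-> Interface status is normal?
     No:  A device fault occurs
     Yes: A service fault occurs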
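
The decision sequence in Figure 1-2 can be sketched as a short function. This is a minimal illustration, assuming that the four checks have already been performed by hand and their results entered as True or False; the names CHECKS and identify_fault_type are hypothetical.

```python
from typing import Dict, Optional, Tuple

# The four checks of Figure 1-2, in the order in which they are made.
CHECKS = (
    "console login",
    "system startup",
    "board status",
    "interface status",
)


def identify_fault_type(results: Dict[str, bool]) -> Tuple[str, Optional[str]]:
    """Return the fault type and, for a device fault, the first failed check.

    A check that is missing from results is treated as failed.
    """
    for check in CHECKS:
        if not results.get(check, False):
            return "device fault", check
    # Every check passed, so the problem lies in the services.
    return "service fault", None
```

Reporting the first failed check points to the matching section of the device fault chapter.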

Table 1-1 Methods of identifying the fault type

Item: Login through the console interface
Identifying method: Connect the COM port of the PC or terminal to the console interface of the S9300 with a standard RS-232 configuration cable and set the relevant parameters correctly on the terminal. For details, refer to the Quidway S9300 Terabit Routing Switch Configuration Guide - Basic Configurations. Check that the terminal displays output normally; for example, the <Quidway> prompt appears on the terminal.

Item: System startup
Identifying method: Check whether the system starts normally. If a command prompt such as <Quidway> is displayed, the system has started normally.

Item: Board status
Identifying method: Run the display device command on the terminal to check whether the status of all boards is Normal. In the case of a local fault, check the status of the service board connected to the user who reports the fault. For example:

<Quidway> display device
S9312's Device status:
Slot  Sub  Type  Online   Power    Register    Alarm   Primary
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
9     -    LPU   Present  PowerOn  Registered  Normal  NA
13    -    SRU   Present  PowerOn  Registered  Normal  Master

Item: Interface status
Identifying method: Run the display interface command on the terminal to check whether the status of the interface connected to the user who reports the fault is Up and whether the numbers of transmitted and received packets on the interface increase during a specified period. For example:

<Quidway> display interface GigabitEthernet 1/0/12
GigabitEthernet1/0/12 current state : UP
Description:HUAWEI, Quidway Series, GigabitEthernet1/0/12 Interface
Switch Port,PVID :    1,The Maximum Frame Length is 1526
IP Sending Frames' Format is PKTFMT_ETHNT_2, Hardware address is 0018-2000-0083
Speed : 1000,  Loopback: NONE
Duplex: FULL,  Negotiation: ENABLE
Mdi   : NORMAL
Last 300 seconds input rate 0 bits/sec, 0 packets/sec
Last 300 seconds output rate 616 bits/sec, 0 packets/sec
Input:  0 packets, 0 bytes
  Unicast:          0,  NUnicast:          0
  Discard:          0,  Error   :          0
  Jumbo  :          0
Output:  191636 packets, 18992248 bytes
  Unicast:         12,  NUnicast:     191624
  Discard:         19,  Error   :          0
  Jumbo  :          0

After you identify the fault type, see 2 Emergency Maintenance for Device Faults and 3
Emergency Maintenance for Service Faults to proceed with emergency maintenance.
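
If the output of display device and display interface has been captured as text, the board and interface checks in Table 1-1 can be partly automated. The following Python sketch assumes the output layout shown in the examples in Table 1-1 (board rows start with a numeric slot ID; packet totals appear on lines beginning with Input: and Output:). The function names are hypothetical, and other software versions may format the output differently.

```python
import re
from typing import Dict, List


def abnormal_boards(display_device_output: str) -> List[str]:
    """Return the slot IDs of board rows that do not report Normal."""
    slots = []
    for line in display_device_output.splitlines():
        fields = line.split()
        # Board rows are assumed to start with a numeric slot ID.
        if fields and fields[0].isdigit() and "Normal" not in fields:
            slots.append(fields[0])
    return slots


def interface_is_up(display_interface_output: str) -> bool:
    """Check the 'current state : UP' line of the interface output."""
    return re.search(r"current state\s*:\s*UP\b", display_interface_output) is not None


def packet_counters(display_interface_output: str) -> Dict[str, int]:
    """Extract the total input and output packet counts."""
    matches = re.findall(
        r"^\s*(Input|Output):\s*(\d+) packets",
        display_interface_output,
        flags=re.MULTILINE,
    )
    return {direction.lower(): int(count) for direction, count in matches}


def traffic_increased(earlier_output: str, later_output: str) -> bool:
    """Compare two captures taken some time apart on the same interface."""
    before = packet_counters(earlier_output)
    after = packet_counters(later_output)
    return all(after.get(key, 0) > before.get(key, 0) for key in ("input", "output"))
```

To use traffic_increased, capture the display interface output twice, a known interval apart, and pass both captures in order. A result of False means that at least one direction shows no new packets in that interval.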

1.5.5 Obtaining Help


Obtain Huawei technical support according to the contact information given in Technical
Support.

1.5.6 Checking the Handling Result


After services resume, check the device status, board indicators, and alarms to confirm that the system is running normally. Perform a dialing test to verify that services are normal. For detailed operations, refer to the Quidway S9300 Terabit Routing Switch Routine Maintenance.
Arrange for technical personnel to monitor the system during peak service hours so that any further problems can be handled immediately.

1.5.7 Recording Information About Emergency Maintenance


Record the following information about emergency maintenance for further analysis:
• Time of emergency maintenance
• Version information
• Fault symptom
• Handling procedure and result

For the format of an information record table, refer to Appendix A Emergency Maintenance
Record Table.
You need to record the output information during emergency maintenance by using the Capture
Text function of the HyperTerminal or the related functions of other Telnet terminals.
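The four items listed above can also be kept as a structured record alongside the captured terminal output. The sketch below is an assumption for illustration only: the class name, field names, and sample values are hypothetical and do not reproduce the format in Appendix A.

```python
from dataclasses import dataclass, asdict


@dataclass
class MaintenanceRecord:
    """One emergency maintenance record (hypothetical field names)."""
    time: str
    version: str
    fault_symptom: str
    handling_and_result: str


# Sample values are invented for illustration.
record = MaintenanceRecord(
    time="2009-04-15 10:30",
    version="V100R001",
    fault_symptom="GigabitEthernet1/0/12 is DOWN",
    handling_and_result="Ran undo shutdown; interface is UP",
)

# List any item that was left empty before the record is filed.
missing = [name for name, value in asdict(record).items() if not value.strip()]
print(missing)
```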

1.6 Emergency Maintenance Precautions


This section describes the precautions for emergency maintenance.
To ensure the security of the device and safety of the operators, comply with the following
guidelines.

Static Electricity
Wear an ESD wrist strap before operating a board or the backplane, and follow these rules:
- When you replace a board:
  - If the board to be replaced is an active SRU/MCU, perform an active/standby switchover
    first, and then remove the board. A standby SRU/MCU can be removed directly without an
    active/standby switchover.
  - If the board to be replaced is a standby SRU/MCU, an LPU, or a CMU, run the
    power off slot slot-id command to power off the board, and then remove the board.
- Always keep a board in its antistatic bag until you install it.
- Always place a removed board in an antistatic bag.

Laser/LED
When you maintain a device with an optical module or optical interface, follow these rules:
- Do not look directly into an optical fiber that is emitting light when you install or
  maintain the optical fiber.
- Do not look directly into the connector of an optical fiber that is emitting light when you
  replace a pluggable optical module.
- Only qualified personnel who have attended training can operate optical modules and
  optical fibers.

CAUTION
When you install and maintain an optical fiber, keep the fiber connector clean, and keep the
fiber straight and free of folds.

1.7 Technical Support


This section describes how to seek Huawei technical support.
If a fault persists after the maintenance personnel perform emergency maintenance according
to the flowchart, contact Huawei Technical Support Center or the local office for technical
support.
NOTE

Huawei Technologies Co., Ltd. provides 24-hour technical support services.

You can contact Huawei Technical Support Center at:
- Telephone: +86-755-28560000
- Fax: +86-755-28560111
- Website: http://support.huawei.com
- Email: support@huawei.com
NOTE

- For contact information about local offices, log in to http://support.huawei.com.
- To make it easier to contact technical support personnel, it is recommended that you compile
  a phone directory and post it at the maintenance site. The phone directory can contain
  contact information about the superior maintenance personnel, Huawei engineers,
  transmission office maintenance personnel, and remote office maintenance personnel.
  At least two contact methods must be provided for each person.

The maintenance personnel need to keep a detailed record of the emergency maintenance
procedures, notify Huawei of the type of the board to be replaced, and apply for a spare one
according to the warranty terms, so that the fault can be removed sooner. The notification can
be faxed in the format of the Notice of Emergency Maintenance. For details, refer to Appendix A
Emergency Maintenance Record Table.

2 Emergency Maintenance for Device Faults


About This Chapter


This chapter describes the flowchart and directions of handling device faults.
2.1 Overview
This section describes the definition and types of device faults.
2.2 Flow for Handling Device Faults
This section describes the flowchart for handling device faults.
2.3 Directions for Emergency Maintenance
This section describes the flowchart and procedure for handling a device fault.

2.1 Overview
This section describes the definition and types of device faults.
A device fault refers to a hardware failure of a device. To rectify a device fault, you must reset,
repair, or replace the relevant hardware.
While a device is running, you can determine that a device fault has occurred and initiate
emergency maintenance in any of the following cases:
- You fail to log in to the system through the console interface.
- You fail to start the system.
- The board status is abnormal.
- The interface status is abnormal.

2.2 Flow for Handling Device Faults


This section describes the flowchart for handling device faults.
The roadmap of the emergency maintenance for device faults is as follows:
1. Check the status of the integrated system.
2. Check the board status.
3. Check the interface status.

Figure 2-1 shows the flowchart for handling device faults.

Figure 2-1 Flowchart for handling device faults
[Flowchart: Start -> Can you log in through the console interface? (No: handle the failed
system login through the console interface) -> Does the system start normally? (No: handle
the failed system start) -> Is the board status normal? (No: handle the abnormality of the
board status) -> Is the interface status normal? (No: handle the abnormality of the interface
status) -> If all answers are Yes, proceed to the flow for handling service faults.]
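The order of checks in Figure 2-1 can be expressed as a simple lookup that returns the first procedure whose check fails. This is a sketch only: the observation keys and the function name next_procedure are hypothetical, and the answers themselves still have to come from the manual checks described in this chapter.

```python
# Checks in the order given by Figure 2-1, each paired with the section to follow
# when the check fails. The dictionary keys are hypothetical names.
CHECKS = [
    ("console_login_ok", "2.3.1 Failure to Log In to a System Through the Console Interface"),
    ("system_starts", "2.3.2 Failure to Start a System"),
    ("board_status_normal", "2.3.3 Abnormality of the Board Status"),
    ("interface_status_normal", "2.3.4 Abnormality of the Interface Status"),
]


def next_procedure(observations):
    """Return the section to follow for the first failed check."""
    for key, section in CHECKS:
        if not observations[key]:
            return section
    # Every device check passed: continue with the service fault flow.
    return "3 Emergency Maintenance for Service Faults"


example = {
    "console_login_ok": True,
    "system_starts": True,
    "board_status_normal": False,
    "interface_status_normal": True,
}
print(next_procedure(example))
```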

2.3 Directions for Emergency Maintenance


This section describes the flowchart and procedure for handling a device fault.
2.3.1 Failure to Log In to a System Through the Console Interface
2.3.2 Failure to Start a System
2.3.3 Abnormality of the Board Status
2.3.4 Abnormality of the Interface Status

2.3.1 Failure to Log In to a System Through the Console Interface


Fault Description
After the COM port of a PC or terminal is connected to the console interface of an S9300 with a
standard RS-232 configuration cable and the relevant parameters are set, nothing is displayed
on the terminal.

Fault Information Collection


If you are unable to log in to a system through the console interface, collect the following
information besides that described in Guide to Fault Information Collection for future
reference.
Table 2-1 Collection of information about the failure to log in to a system through the console
interface

No.  Collecting Item               Collecting Method
1    Communication parameters of   Check the communication parameters of the COM port set in
     the COM port                  the terminal program, such as the Windows-based
                                   HyperTerminal, including the baud rate, data bits, parity
                                   check, stop bits, and flow control.
2    Indicator status              Check the status of the following indicators:
                                   - RUN, ALM, and ACT indicators of the MCU/SRU
                                   - RUN, ALM, and FAULT indicators of the power modules
                                   - Status indicators of the fans

Handling Flowchart
Figure 2-2 Flowchart for handling the failed login to a system through the console interface
[Flowchart: Start -> Are the parameters of the COM port correct? (No: modify the parameters)
-> Is the cable in good condition? (No: replace the cable) -> Does the power module run
normally? (No: repair the power supply system) -> Does the SRU/MCU run normally? (No: switch
to or replace the SRU/MCU) -> Reset the system -> Seek technical support -> End. After each
action, the flow ends if the fault is rectified; otherwise it continues with the next check.]

CAUTION
All the following steps can be performed only when the user services are already interrupted. If
the user services are not interrupted, collect fault information and provide feedback to Huawei
engineers for further processing.

Procedure
Step 1 Check and modify the parameters of the COM port.
Check whether the parameters of the COM port are identical with those of the console interface
on the S9300. If they are not, modify the parameters of the COM port.
By default, the console interface of the S9300 uses a baud rate of 9600 bps, 8 data bits,
1 stop bit, no parity check, and no flow control.
NOTE

If the parameters of the console interface have been modified, set the COM port to the modified
values.
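The comparison in Step 1 amounts to checking each terminal setting against the console interface setting. The sketch below uses the default values stated above; the dictionary keys and the function name mismatched_settings are hypothetical, and the terminal values in the example are invented.

```python
# Default console interface settings of the S9300, as stated in Step 1.
DEFAULT_CONSOLE = {
    "baud_rate": 9600,
    "data_bits": 8,
    "stop_bits": 1,
    "parity": "none",
    "flow_control": "none",
}


def mismatched_settings(terminal, console=DEFAULT_CONSOLE):
    """Return {setting: (terminal value, console value)} for every difference."""
    return {
        name: (terminal.get(name), expected)
        for name, expected in console.items()
        if terminal.get(name) != expected
    }


# Invented terminal configuration with two wrong settings.
terminal = {
    "baud_rate": 115200,
    "data_bits": 8,
    "stop_bits": 1,
    "parity": "none",
    "flow_control": "hardware",
}
print(mismatched_settings(terminal))
```

If the console interface has been reconfigured, pass the modified values as the console argument instead of the defaults.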

Step 2 Check and replace the cable.
If the parameters of the COM port are correct, check whether the cable is in good condition.
You can replace the cable with a new one and check whether you can then log in normally.
Step 3 Check and repair the power supply system.
If the indicators of all the boards are off and all the fans fail to work (which you can
determine by listening for fan rotation), or if the ALM indicator of the power module is on,
the power supply system of the device is possibly faulty and needs repair.
The power supply system consists of the following:
- Power supply system of the equipment room, chassis, or cabinet
- Power module
- Power supply system of the backplane

You can check the power supply system as follows:
- Check that the power module is switched on. When there are multiple power modules, ensure
  that at least one works normally.
- Check whether the ALM indicator of the power module is on. If so, the power module is
  faulty. You can replace the power module to solve the problem.
- If no problem is found in the preceding checks but the power supply system still fails
  to work, see Technical Support to seek Huawei technical support.

Step 4 Switch to the standby MCU/SRU or replace the MCU/SRU.
If you have confirmed that the parameters of the COM port are correctly set, the cable is in
good condition, and the power supply system works normally, the MCU/SRU is possibly faulty.
When there are active and standby MCUs/SRUs, you can connect the configuration cable to the
standby MCU/SRU; when there is only one MCU/SRU, you can replace it with a spare one.
Step 5 Reset the system.
If the fault persists after you perform the preceding steps, you can reset the system. To do so,
switch off the power modules, wait three minutes, and then switch them on again.
Step 6 Seek technical support.
For seeking Huawei technical support, see Technical Support.
----End

2.3.2 Failure to Start a System


Fault Description
A system fails to start, and the terminal shows one of the following symptoms:
- The terminal displays a message indicating that initialization fails.
- The terminal stops at the file decompression stage for a long period.
- The system restarts continuously.

Fault Information Collection


If you are unable to start a system, collect the following information besides the generic
information described in Guide to Fault Information Collection for future reference.
Table 2-2 Collection of information about the failure to start a system

No.  Collecting Item               Collecting Method
1    Information about system      Use the Capture Text function of the HyperTerminal or the
     startup                       related functions of other Telnet terminals to record
                                   information about system startup through a COM port or
                                   Telnet terminal.
2    Name of the startup file      Check the name of the startup file through the Basic Input/
                                   Output System (BIOS) menu.

Handling Flowchart
Figure 2-3 Flowchart for handling the failed system start
[Flowchart: Start -> Does the CF card self-test fail? (Yes: remove and insert the CF card; if
the fault persists, replace the CF card) -> Does the module self-test fail? (Yes: debug or
replace the SRU/MCU) -> Does the system restart continuously, or is the file incorrectly
decompressed? (Yes: re-upload the startup files through BIOS, and make the startup files of
the active and standby SRUs/MCUs identical) -> Seek technical support -> End. After each
action, the flow ends if the fault is rectified.]

CAUTION
All the following steps can be performed only when the user services are already interrupted. If
the user services are not interrupted, collect fault information and provide feedback to Huawei
engineers for further processing.

Procedure
Step 1 Remove and insert the CF card.
If the "CF Card Init.....FAIL!" message is displayed, the CF card may be loosely seated. You can
try the following operations to solve the problem:
1. Remove the MCU/SRU.
2. Remove the CF card, and then insert it.
3. Re-insert the MCU/SRU.

Step 2 Replace the CF card.
If the fault persists after the CF card is re-inserted, replace the CF card.
Step 3 Replace the MCU/SRU.
If the system prompts "Initializing module IPC_VP_CHANNEL.................FAIL!", or the
memory self-test still fails after you perform Steps 1 and 2, the MCU/SRU is possibly
faulty. You can try to replace the MCU/SRU. When there is only one MCU/SRU, you can replace
it with a spare one.
Step 4 Upload the startup file through BIOS again.
When the system stops at the phase of file decompression or continuously restarts, the startup
file is possibly incorrect or damaged. You can try to upload the startup file through BIOS.
It is complicated to upload the startup file through BIOS. Contact Huawei engineers and perform
the uploading with their guidance. For the procedures, see System Upgrading Through
BIOS.
Step 5 Seek technical support.
For seeking Huawei technical support, see Technical Support.
----End

2.3.3 Abnormality of the Board Status


Fault Description
The board status is abnormal in one or more of the following cases:
- When you run the display device command to view information about a board, the board
  status is Abnormal.
- When you run the display device command to view information about a board, the board
  status is Unregistered.
- The RUN/ALM indicator of a board blinks at a frequency of 2 Hz or the red indicator is on.
- A board continuously restarts.
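The four symptoms above can be collected into a small check that reports which of them apply to a board. This is a sketch under assumptions: the function name and parameter names are hypothetical, and the status and registration strings are taken from the display device values mentioned in this manual (Normal/Abnormal, Registered/Unregistered).

```python
def board_fault_reasons(status, registration, run_alm_blink_hz=None,
                        red_indicator_on=False, restarts_continuously=False):
    """Return the list of symptoms from the list above that apply to a board."""
    reasons = []
    if status == "Abnormal":
        reasons.append("display device reports status Abnormal")
    if registration == "Unregistered":
        reasons.append("display device reports Unregistered")
    if run_alm_blink_hz == 2:
        reasons.append("RUN/ALM indicator blinks at 2 Hz")
    if red_indicator_on:
        reasons.append("red indicator is on")
    if restarts_continuously:
        reasons.append("board restarts continuously")
    return reasons


print(board_fault_reasons("Abnormal", "Unregistered", run_alm_blink_hz=2))
```

An empty list means that none of the listed symptoms was observed.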

Fault Information Collection


For the abnormality of the board status, collect the following information besides the generic
information described in Guide to Fault Information Collection for future reference.
Table 2-3 Collection of information about the abnormality of the board status

No.  Collecting Item               Collecting Method
1    Indicator status of a board   Check whether the indicator of a board is off, is on, blinks
                                   at a frequency of 2 Hz, or blinks at a frequency of 1 Hz.
2    Detailed information about    Check detailed information about a board by using the
     a board                       display device slot-id command.

Handling Flowchart
Figure 2-4 Flowchart for handling the abnormality of the board status
[Flowchart: Start -> Reset the board -> Fault rectified? (Yes: End) -> Replace the board ->
Fault rectified? (Yes: End) -> Cut over the services on the board and seek technical
support -> End.]

CAUTION
All the following steps can be performed only when the user services are already interrupted. If
the user services are not interrupted, collect fault information and provide feedback to Huawei
engineers for further processing.

Procedure
Step 1 Reset the board.
It is complicated to handle the abnormality of the board status. In an emergency, it is
recommended that you solve the problem by resetting or replacing the board. For other
maintenance measures, such as fault location, contact Huawei engineers.
You can reset a board by using the reset slot command, by pressing the RESET button on the
panel, or by removing and re-inserting the board.
Step 2 Replace the board.
When resetting the board fails to solve the problem, you can try to replace the board with a spare
one.
Step 3 Cut over the services on the board and seek technical support.
If the fault persists after you perform steps 1 and 2, you can cut over the services on the faulty
board to a board that is running normally or to a board in an idle slot. For the cutover
operations, contact Huawei engineers or perform the cutover according to the cutover scheme
of the customer. In addition, provide fault information to the local office for technical support.
----End

2.3.4 Abnormality of the Interface Status


Fault Description
The interface status is abnormal in one or more of the following cases:
- When you run the display interface command to view the status of an interface, the
  interface status is DOWN.
- When you run the display interface command to view the status of an interface, the numbers
  of sent and received packets on the interface remain the same.
- The indicator status of an interface is abnormal. For example, the LINK indicator of the
  interface is off.

Fault Information Collection


For the abnormality of the interface status, collect the following information besides the generic
information described in Guide to Fault Information Collection for future reference.
Table 2-4 Collection of information about the abnormality of the interface status

No.  Collecting Item               Collecting Method
1    Indicator status of an        Check whether the indicator of an interface is off, is on,
     interface                     blinks at a frequency of 2 Hz, or blinks at a frequency
                                   of 1 Hz.
2    Detailed information about    Collect detailed information about an interface by using
     an interface                  the display interface interface-type interface-number
                                   command.
3    Brief IP-related information  Collect brief IP-related information about an interface by
     about an interface            using the display ip interface brief command.
4    Brief information about all   Collect brief information about all interfaces by using the
     interfaces                    display interface brief command.

Handling Flowchart
Figure 2-5 Flowchart for handling the abnormality of the interface status
[Flowchart: Start -> Is the status of the interface indicator normal? -> Is the interface
status Up? (No: if the interface was manually shut down, bring it up; otherwise detect the
link) -> Are packets sent and received normally? (Yes: proceed to the flow for handling
service faults; No: perform a local loopback test) -> Is the status normal? (check and modify
the configuration of the data link layer or the upper layer protocol) -> Reset the interface
-> If the fault persists, cut over the services on the board and seek technical support ->
End.]

CAUTION
All the following steps can be performed only when the user services are already interrupted. If
the user services are not interrupted, collect fault information and provide feedback to Huawei
engineers for further processing.

Procedure
Step 1 Start the interface.
When you find that an interface is shut down through the shutdown command by checking the
configuration, you can run the undo shutdown command in the interface view to start it.
Step 2 Detect the link.
Before detecting a link, check whether the LINK indicator of the interface is on.
If so, the physical link is Up and you can detect the link as follows:
1. Check that the interface parameters at both ends of the link, such as the duplex mode and
   rate, are identical.
2. When the interfaces are optical ones, check whether the receiving and sending optical
   powers at both ends are normal by using an optical power meter. If either end can only
   send or only receive data, the optical module is possibly faulty or the optical fiber
   possibly does not match the optical module. In this case, you can try to replace the
   optical module or the optical fiber.

DANGER
Do not look directly into an optical fiber that is emitting light when you check the receiving
and sending optical powers. You must use an optical power meter to measure the optical power.
When the LINK indicator of the interface is off, you can check the link as follows:
1. Perform a physical loopback test on the device. That is, connect the faulty interface to
   another interface that is in the normal state with an optical fiber or cable in good condition.
2. If the LINK indicator turns on, the interface runs normally. You need to check whether the
   optical fiber or the cable is damaged and whether the trunk link runs normally. In this
   case, the neighboring office is required to cooperate.
3. If the LINK indicator stays off, the interface hardware is faulty. When a pluggable optical
   module is used, you can replace the optical module; otherwise, you can cut over the
   services from the faulty interface to another interface that runs normally.

Step 3 Perform a local loopback test.


When the interface status is Up, but the number of sent and received packets on the interface
remains the same during a long period, it indicates that the interface neither receives nor sends
any packets. Then you can run the loopback local command on the interface to perform a local
loopback test and test data sending and receiving by using the ping command to view the change
in the number of sent and received packets.
NOTE

After the local loopback test is complete, run the undo loopback command to disable the local loopback
immediately.
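The condition that triggers the local loopback test in Step 3 (interface Up, but packet counters unchanged between two readings) can be checked mechanically when two readings of the display interface counters are available. The sketch below is an assumption for illustration: the function name and the (input packets, output packets) tuple layout are hypothetical.

```python
def needs_loopback_test(state, before, after):
    """Return True if the interface is Up but neither packet counter changed.

    before and after are (input_packets, output_packets) tuples taken from two
    display interface readings separated by the observation period.
    """
    return state.upper() == "UP" and before == after


# Counters from the earlier example, read twice with no change in between.
print(needs_loopback_test("UP", (0, 191636), (0, 191636)))
```

How long to wait between the two readings is a judgment call; the manual only says that the counters remain the same during a long period.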

Step 4 Check and modify the configurations of the data link layer or the upper layer protocols.
If the interface still fails to send and receive packets in the local loopback test, check the
configuration of the data link layer or the upper layer protocols. For example, check that the
configurations of the Point-to-Point Protocol (PPP) or the High-level Data Link Control (HDLC)
protocol at both ends are identical and that the routing protocols run normally.
Step 5 Reset the interface.
If the fault persists after you perform the preceding steps, you can reset the interface.
To reset an interface, run the shutdown command and then the undo shutdown command in the
interface view.
Step 6 Contact Huawei technical support personnel.
For seeking Huawei technical support, see 1.7 Technical Support.
----End

3 Emergency Maintenance for Service Faults


About This Chapter


This chapter describes the flowchart and directions for handling service faults.
3.1 Overview
This section describes the definition and types of service faults.
3.2 Flow for Handling Service Faults
This section describes the flowchart for handling service faults.
3.3 Guide to Emergency Maintenance
This section describes the flowchart and procedure for handling a service fault.

3.1 Overview
This section describes the definition and types of service faults.
A service fault refers to partial or global service congestion caused by a software or network
fault. You can handle a service fault by modifying the service configuration, resetting service
modules, or restoring network connections.
NOTE

Generally, a hardware fault may also result in service interruption. For the handling of a device
fault, see Emergency Maintenance for Device Faults.

This chapter describes the emergency maintenance for service faults, focusing on fault clearance
and prompt service recovery rather than fault rectification. To locate, handle, and rectify common
service faults, refer to the Quidway S9300 Terabit Routing Switch Troubleshooting.
For the S9300, the common emergency service faults fall into the following types:
- Failure to forward IP unicast packets
- Failure to forward IP multicast packets
- Failure to forward MPLS VPN packets

NOTE

MPLS VPN = Multi-Protocol Label Switching Virtual Private Network

3.2 Flow for Handling Service Faults


This section describes the flowchart for handling service faults.
Figure 3-1 shows the flowchart for handling service faults.

Figure 3-1 Flowchart for handling service faults
[Flowchart: Start -> Does the fault involve all users? (Yes: handle the fault) -> Does the
fault involve the users on a certain board? (Yes: handle the fault) -> Does the fault involve
the users on a certain interface? (Yes: handle the fault) -> Does the fault involve users of a
certain type? (Yes: handle the fault) -> Does the fault involve a single user? (Yes: proceed
to the troubleshooting flow) -> End.]

NOTE

For a fault that affects a single user, you do not need to initiate emergency maintenance. For the
common handling flowchart of a fault, refer to the Quidway S9300 Terabit Routing Switch
Troubleshooting.

3.3 Guide to Emergency Maintenance


This section describes the flowchart and procedure for handling a service fault.
3.3.1 Failure to Forward IP Unicast Packets
3.3.2 Failure to Forward IP Multicast Packets
3.3.3 Failure to Forward MPLS VPN Packets

3.3.1 Failure to Forward IP Unicast Packets


Fault Description
Certain unicast packets on a network cannot be forwarded.

Fault Information Collection


If IP unicast packets cannot be forwarded, collect the following information besides the generic
information described in Guide to Fault Information Collection for fault location and future
reference.
Table 3-1 Collection of information about the failure to forward IP unicast packets

No.  Collecting Item                              Collecting Method
1    Information about FIB entries                display fib
2    Information about certain FIB entries on     display fib [ vpn-instance vpn-instance-name ]
     a board in a specified slot                  [ | { begin | exclude | include } regular-expression ]
3    Information about ARP entries on a           display arp interface
     specified interface
4    Information about public BGP routes          display bgp routing-table
5    Information about the routes advertised      display bgp routing-table peer ip-address
     to and imported from a specified peer        { advertised-routes | received-routes } [ statistics ]
6    Information about the establishment of       display bgp peer
     all BGP peers
7    Information about the interface enabled      display isis interface [ verbose ]
     with IS-IS
8    Information about IS-IS LSDBs                display isis lsdb
9    Configuration of mesh-groups                 display isis mesh-group
10   Information about the IS-IS neighbor         display isis peer [ verbose ]
     relationships set up with the local end
11   Information about the IS-IS routing table    display isis route
12   Logs of IS-IS routing calculation            display isis spf-log
13   OSPF errors                                  display ospf error
14   Information about the interface enabled      display ospf interface
     with OSPF
15   Information about OSPF peers                 display ospf peer
16   Information about OSPF LSDBs                 display ospf lsdb
17   Information about the OSPF routing table     display ospf routing
18   Running status and configuration of RIP      display rip
     processes
19   Information about all the activated          display rip database
     routes of the RIP database
20   Information about the interface enabled      display rip interface
     with RIP
21   Information about RIP neighbors              display rip neighbor

NOTE
FIB = Forwarding Information Base; ARP = Address Resolution Protocol; BGP = Border Gateway
Protocol; IS-IS = Intermediate System to Intermediate System; LSDB = Link State Database; OSPF =
Open Shortest Path First; RIP = Routing Information Protocol

Handling Flowchart
Figure 3-2 Flowchart for handling the failure to forward IP unicast packets
[Flowchart: Start -> Can the device receive upstream packets? (No: recover the uplink) -> Can
the device forward packets? (No: recover the downlink) -> Are the routing entries correct?
(No: reset the routing protocol) -> Are the forwarding entries correct? (No: refresh the FIB
entries) -> Reset the system -> Seek technical support -> End. After each action, the flow
ends if the fault is rectified; otherwise it continues with the next check.]

CAUTION
All the following steps can be performed only when the user services are already interrupted. If
the user services are not interrupted, collect fault information and provide feedback to Huawei
engineers for further processing.

Procedure
Step 1 Check and recover the uplink.
When some unicast packets fail to be forwarded, check whether the S9300 can receive upstream
packets. Run the display interface command to see whether the number of received packets on the
device changes. If the device cannot receive any upstream packets, perform the following:
1. Check whether the status of the upstream interface on the S9300 is normal. For details, see
   Abnormality of the Interface Status.
2. If the status of the upstream interface is normal, ping the peer interface of the upstream
   interface. If the ping succeeds, you can assume that the fault is on the upstream device. To
   recover the system, contact the site office where the upstream device resides.
3. If the ping fails, check the link that connects the interface on the S9300 to the upstream
   device. For example, check that the cable is correctly positioned, that the optical module and
   the optical power are normal, that the relay agent is normal, and that the IP address is
   correct.
4. If the fault persists after you perform the preceding steps, contact Huawei for technical
   support. For details, see Technical Support.

Step 2 Check and recover the downlink.
When the S9300 can receive incoming packets but cannot send packets, check the connection and
communication between the S9300 and the downstream device as follows:
1. Check whether the status of the downstream interface on the S9300 is normal. For details,
   see Abnormality of the Interface Status.
2. If the status of the downstream interface is normal, ping the peer interface of the downstream
   interface. If the ping succeeds, you can assume that the fault is on the downstream device. To
   recover the system, contact the site office where the downstream device resides.
3. If the ping fails, check the link that connects the interface on the S9300 to the downstream
   device. For example, check that the cable is correctly positioned, that the optical module and
   the optical power are normal, that the relay agent is normal, and that the IP address is
   correct.
4. If the link is in good condition, the communication between the S9300 and the downstream
   device is possibly abnormal. Check the configuration, such as routing, according to the
   following step.

Step 3 Check and restore the routing entries.
If the S9300 fails to communicate with its downstream device, the routing entries are possibly
incorrect. Check and restore the routing entries as follows:
1. Check whether a route to the downstream device exists in the routing table of the S9300.
   If the route does not exist, add a static route, and then check whether the ARP entries of
   the downstream device can be learned.
2. If the ARP entries of the downstream device cannot be learned, add static ARP entries.
3. If there is still no route to the downstream device in the routing table of the S9300, the
   routing table is possibly oversized. Try deleting unnecessary routing entries and updating
   the routing table. Then check whether the S9300 learns the route.
4. If a route to the downstream device exists, check whether this routing entry is correct,
   including the routing protocol, subnet mask, preference, and hop count. Because the
   troubleshooting of IP routing is complicated, it is not described here. For details, refer to
   the Quidway S9300 Terabit Routing Switch Troubleshooting - IP Routing.
5. If the fault persists after you perform the preceding steps, reset the relevant routing
   protocol. For example, reset all IS-IS connections through the reset isis all command.
6. If resetting the relevant routing protocol is ineffective, proceed to the following step.

Step 4 Check and restore FIB entries.
If the communication fails although the routing entries are normal, the FIB entries are possibly
incorrect. Run the display fib [ verbose ] command to check whether the FIB entries are correct.
If the FIB entries are incorrect, update the FIB entries and deliver them again.
Step 5 Reset the system.
Resetting the system is the last, and the most effective, way to solve a software problem. If
other users are not affected, you can reset the system to solve the problem.
Before resetting a system by using the reboot command, save the current configurations with
the save command. If the fault affects only a small range, you can run the schedule reboot
command to reset the system during off-peak hours, such as the early morning.
NOTE
If the system can be reset through a software program, do not power off the device.

Step 6 Seek technical support.
For details about seeking Huawei technical support, see Technical Support.
----End
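
The order of the steps above can be sketched in a few lines of Python. This is only an illustration of the control flow, not a tool that talks to the device: the callback fault_rectified_after is a hypothetical stand-in for an operator performing a step and reporting whether the fault is rectified.

```python
from typing import Callable, List, Optional, Tuple

# Recovery steps in the order given by the procedure above.
RECOVERY_STEPS = [
    "Check and recover the uplink",
    "Check and recover the downlink",
    "Check and restore the routing entries",
    "Check and restore FIB entries",
    "Reset the system",
]


def run_procedure(
    fault_rectified_after: Callable[[str], bool],
) -> Tuple[Optional[str], List[str]]:
    """Walk the steps in order and stop at the first one that rectifies the fault.

    fault_rectified_after is a hypothetical callback supplied by the caller.
    Returns (step that rectified the fault or None, steps attempted).
    None means that every step failed, so the next action is to seek
    technical support.
    """
    attempted: List[str] = []
    for step in RECOVERY_STEPS:
        attempted.append(step)
        if fault_rectified_after(step):
            return step, attempted
    return None, attempted
```

A result of None corresponds to Step 6: every recovery step was tried and the fault persists.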

3.3.2 Failure to Forward IP Multicast Packets


Fault Description
You can determine that a failure to forward IP multicast packets occurs in any of the following
situations:
- A multicast distribution tree (MDT) cannot be set up.
- No multicast routing entry exists on the S9300 directly connected to the multicast source.
- Clients fail to receive multicast data, which may be due to the incorrect configuration of
  the Internet Group Management Protocol (IGMP).
- The Protocol Independent Multicast (PIM) routing table has no (S, G) entry.
- The multicast data can reach intermediate S9300s but not the last-hop S9300.
- Although an interface on an intermediate S9300 receives the multicast data, no
  corresponding (S, G) entry is created in the PIM routing table.
- The static Rendezvous Point (RP) fails to communicate with the dynamic RP.
- Mosaics are displayed in the multicast video image on clients.
- The multicast video programs displayed on clients connected to different S9300s are out of
  sync, although each program plays smoothly, without mosaics.

Fault Information Collection


If the IP multicast packets cannot be forwarded, collect the following information besides the
generic information described in Guide to Fault Information Collection for future reference.
NOTE

Before using the debugging command to collect debugging information, run the terminal debugging
command to enable the debugging display on a terminal, and then run the terminal monitor command
to enable the display on the terminal.

For ease of fault location, it is recommended to collect long-term debugging information.

After you collect debugging information, run the undo debugging all command to disable all the
debugging immediately.

Table 3-2 Collection of information about the failure to forward IP multicast packets

No. | Collecting Item | Collecting Method
----+-----------------+------------------
1   | All routes learned on the S9300 | display ip routing-table
2   | PIM routing table on the S9300 | display pim routing-table
3   | Information about the unicast routes used by PIM | display pim claimed-route
4   | Multicast routing table on the S9300 | display multicast routing-table
5   | Multicast forwarding table on the S9300 | display multicast forwarding-table
6   | All PIM neighbors of the S9300 | display pim neighbor
7   | All the interfaces enabled with PIM on the S9300 | display pim interface
8   | BSR information learned by the S9300 when PIM-SM is enabled | display pim bsr-info
9   | RP information learned by the S9300 when PIM-SM is enabled | display pim rp-info
10  | Whether the group that wants to receive multicast data can be mapped to the RP when the S9300 runs PIM-SM | display pim rp-info group-address
11  | Information about the RPF neighbors and RPF interfaces from the S9300 to the multicast source | display multicast rpf-info source-address
12  | Information about IGMP groups | display igmp group
13  | Information about IGMP interfaces | display igmp interface
14  | Information about the IGMP routing table | display igmp routing-table
15  | All debugging information about PIM | Run the debugging pim all command. After you collect the information, disable the debugging immediately.
16  | All debugging information about IGMP | Run the debugging igmp all command. After you collect the information, disable the debugging immediately.
NOTE
PIM-SM = Protocol Independent Multicast-Sparse Mode; RPF = Reverse Path Forwarding


Handling Flowchart
Figure 3-3 Flowchart for handling the failure to forward IP multicast packets
[Flowchart, summarized: Start. Has the user joined multicast group G through IGMP? If not, restore
the IGMP configurations. Is the TTL of the packets large enough to reach the clients? If not,
modify the TTL. Is the RP for group G identical on all devices? If not, restore the RP
configurations. Are the multicast routing entries correct? If not, restore the routing protocol
configurations. If the fault persists, reset the system; if that fails, seek technical support.
After each action, check whether the fault is rectified; if so, end.]

CAUTION
All the following steps can be performed only when the user services are already interrupted. If
the user services are not interrupted, collect fault information and provide feedback to Huawei
engineers for further processing.

Procedure
Step 1 Check and restore the IGMP configuration.
When clients fail to receive multicast data, check whether the IGMP configuration on the S9300
connecting the clients is correct, as follows:
1. Check whether multicast is enabled on the S9300, that is, whether the multicast
   routing-enable command has been run. If not, enable multicast in the system view and ensure
   that IGMP is enabled on all interfaces. Then check whether the clients succeed in receiving
   multicast data.
2. If the clients still fail to receive multicast data, check whether the interface status is
   normal. Run the display igmp interface interface-name command to view whether information
   about the specified interface is displayed. If no information is displayed, see Abnormality
   of the Interface Status to handle it. If the interface status is normal, check whether the
   clients succeed in receiving multicast data.
3. If the clients still fail to receive multicast data, check whether access control lists (ACLs)
   are configured on the interface to prevent hosts from joining group G. Run the display
   current-configuration interface interface-name command to check whether an IGMP group
   policy is configured. If so, modify the ACL configuration to permit hosts to join group G.
   Then check whether the clients succeed in receiving multicast data.
4. If the clients still fail to receive multicast data, check whether the interface resides on
   the same network as the hosts. If the interface resides on a different network, modify the
   IP address of the interface, and then check whether the clients succeed in receiving
   multicast data.
5. If the fault persists after you perform the preceding checks, run the reset igmp group
   command to delete the IGMP group, and then have the clients join the multicast group again.
6. If deleting the IGMP group is not effective, proceed to the following step.

Step 2 Check and modify the Time-to-Live (TTL) value of the packets sent by the multicast source.
Check the TTL value of the (S, G) packets sent by the multicast source S. If this value is too
small, it is recommended to change it to a larger one so that the packets can reach the hosts.
Step 3 Check and modify the RP configuration.
If the fault persists after you perform the preceding steps, check whether the RP configuration
is correct. First, ensure that PIM is enabled on all the devices in the PIM domain. There are
two cases:
When an RP is specified statically in the network, perform the following:
1. Check whether the same static-rp command is run on all the devices. If not, run the same
   static-rp command on all the devices. When ACLs are configured, ensure that the ACL
   configurations are also the same. Then check whether the clients succeed in receiving
   multicast packets.
2. Check whether ACLs are configured to prevent the static RP from serving group G. If so,
   modify the ACL configuration to remove the restriction. Then check whether the clients
   succeed in receiving multicast packets.
When a dynamic BSR-RP is configured in the network, perform the following:
1. Run the display pim bsr-info command on the BSR to check whether the BSR is correctly
   configured. If the BSR is not correctly configured, reconfigure it.
2. Run the display pim rp-info command on the BSR to check whether the BSR learns RP
   information. If the BSR fails to learn RP information, check that the RP is correctly
   configured, that a route between the BSR and the RP exists, and that the BSR and the RP can
   ping each other. If the route is faulty, refer to the Quidway S9300 Terabit Routing Switch
   Troubleshooting - Multicast.
3. Run the display current-configuration command on both the BSR and the RP to check
   whether the crp-policy commands are run to prohibit group G. If so, modify the ACL
   configuration.
4. If performing this step is not effective, proceed to the following step.

Step 4 Check and restore multicast routing entries.
If the fault persists after you perform the preceding steps, routing entries are possibly faulty.
Perform the following:
1. Check whether the multicast routing entries from the RP to the clients, from the multicast
   source to the RP, and from the multicast source to the clients are correct. For details, refer
   to the VRP Troubleshooting - IP Multicast.
2. If the fault persists after you troubleshoot the multicast routing entries, reset the
   corresponding multicast and unicast routing protocols. For example, reset all IS-IS
   connections through the reset isis all command.
3. If resetting the relevant routing protocols is ineffective, proceed to the following step.

Step 5 Reset the system.
Resetting the system is the last, and the most effective, way to solve a software problem. If
other users are not affected, you can reset the system to solve the problem.
Before resetting a system, save the current configurations with the save command, and then run
the reboot command to reset the system. If the fault affects only a small range, you can run the
schedule reboot command to reset the system during off-peak hours, such as the early morning.
NOTE
If the system can be reset through a software program, do not power off the device.

Step 6 Seek technical support.
For details about seeking Huawei technical support, see Technical Support.
----End
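
Two of the checks in the procedure above, the TTL check in Step 2 and the RP consistency check in Step 3, can be sketched as follows. The sketch assumes that each router on the path decrements the TTL by one and does not forward a packet whose TTL reaches zero, and it ignores any per-interface multicast TTL thresholds. The device names and RP addresses used with it are hypothetical inputs that an operator would gather by hand, for example from display pim rp-info.

```python
from collections import defaultdict
from typing import Dict, List


def ttl_reaches_clients(initial_ttl: int, router_hops: int) -> bool:
    """Return True if a packet sent with initial_ttl can cross router_hops routers.

    Assumption: every router decrements the TTL by one and does not forward
    a packet whose TTL reaches zero, so the initial TTL must exceed the
    number of routers on the path.
    """
    return initial_ttl > router_hops


def rp_views(rp_by_device: Dict[str, str]) -> Dict[str, List[str]]:
    """Group devices by the RP address that each one reports for group G."""
    views: Dict[str, List[str]] = defaultdict(list)
    for device, rp_address in sorted(rp_by_device.items()):
        views[rp_address].append(device)
    return dict(views)


def rp_is_consistent(rp_by_device: Dict[str, str]) -> bool:
    """Return True if all devices report the same RP for group G."""
    return len(rp_views(rp_by_device)) <= 1
```

If rp_is_consistent returns False, rp_views shows which devices disagree, which is where the static-rp or BSR configuration needs to be restored.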


3.3.3 Failure to Forward MPLS VPN Packets


Fault Description
All or some MPLS VPN packets cannot be forwarded on a network.

Fault Information Collection


For the failure to forward MPLS VPN packets, collect the following information besides the
generic information described in Guide to Fault Information Collection for future reference.
NOTE

Before using the debugging command to collect debugging information, run the terminal debugging
command to enable the debugging display on a terminal, and then run the terminal monitor command
to enable the display on the terminal.

For ease of fault location, it is recommended to collect long-term debugging information.

After you collect debugging information, run the undo debugging all command to disable all the
debugging immediately.

Table 3-3 Collection of information about the failure to forward MPLS VPN packets
No. | Collecting Item | Collecting Method
----+-----------------+------------------
1   | Information about LDP and LSRs | display mpls ldp all
2   | Information about all the interfaces enabled with LDP | display mpls ldp interface
3   | Information about the peer | display mpls ldp peer
4   | Information about LDP sessions | display mpls ldp session
5   | Information about LDP LSPs | display mpls ldp lsp
6   | Information about LDP and LSRs of the specified VPN instance | display mpls ldp vpn-instance vpn-instance-name
7   | The values of labels allocated by BGP, LDP LSP or RSVP | display mpls lsp
8   | Information about the VPN instance | display ip vpn-instance verbose
9   | Information about the routing table of the specified VPN instance | display ip routing-table vpn-instance vpn-instance-name
10  | All LDP debugging information | Run the debugging mpls ldp all command. After you collect the information, disable the debugging immediately.

NOTE
LDP = Label Distribution Protocol; LSR = Label Switching Router; LSP = Label Switched Path


Handling Flowchart
NOTE

First, check whether all or only some MPLS VPN services on the network are interrupted. If only
some MPLS VPN services are interrupted, the cause is possibly an incorrect maximum
transmission unit (MTU) setting on a device on the network. The protocol stack or application of
some servers does not reduce the size of the packets it sends. In VPN forwarding, however, such a
packet exceeds the default MTU of 1500 bytes after MPLS labels, each of which is four bytes long,
are added to it. Therefore, the P device that forwards MPLS packets must be configured with an
MTU no smaller than 1500 bytes plus the total label length.

This section only describes the handling flowchart for the failure to forward all MPLS VPN packets.
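
The MTU arithmetic in the note above can be written out as a small calculation. The label size (four bytes) and the default MTU (1500 bytes) come from the note; the number of labels on a packet depends on the network design, so the value of two used in the example (one tunnel label plus one VPN label) is an assumption for illustration.

```python
MPLS_LABEL_BYTES = 4      # each MPLS label is four bytes long
DEFAULT_MTU_BYTES = 1500  # default MTU mentioned in the note


def required_mtu(label_count: int, payload_mtu: int = DEFAULT_MTU_BYTES) -> int:
    """Smallest MTU that carries a payload_mtu-byte packet plus its MPLS labels."""
    if label_count < 0:
        raise ValueError("label_count must not be negative")
    return payload_mtu + label_count * MPLS_LABEL_BYTES


# Assumed example: two labels on a full-sized 1500-byte packet.
print(required_mtu(2))  # 1508
```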

Figure 3-4 Flowchart for handling the failure to forward MPLS VPN packets
[Flowchart, summarized: Start. Are the LDP sessions set up? If not, restore LDP. Are the LSPs set
up? If not, restore the IGP configuration on the public network. Are the VPN instances correctly
configured? If not, restore the IGP configuration of the VPN instances. Is VPN forwarding normal?
If not, restore the VPN routes. If the fault persists, reset the system; if that fails, seek
technical support. After each action, check whether the fault is rectified; if so, end.]

CAUTION
All the following steps can be performed only when the user services are already interrupted. If
the user services are not interrupted, collect fault information and provide feedback to Huawei
engineers for further processing.

Procedure
Step 1 Check and restore LDP.
When MPLS VPN services are interrupted on a network, check whether the LDP sessions
between Provider Edges (PEs) are set up. If the LDP sessions are not set up, perform the
following:
1. Run the display mpls ldp command to check whether the LSR IDs of different PEs conflict.
   Like a router ID, an LSR ID must be unique across the network. If the LSR IDs conflict,
   change them so that each of them is unique. Then check whether the LDP sessions can be set
   up.
2. If the LDP sessions still cannot be set up, run the display mpls ldp peer command to check
   the IP address of the peer.
3. Run the ping -a source-ip command to check whether the peer address is reachable.
4. If the peer cannot be pinged, run the display ip routing-table command to check whether
   a route to the peer exists. Then run the display fib command to check whether the
   corresponding forwarding entry exists in the FIB table of the local end. If neither the route
   nor the corresponding forwarding entry exists, check the link layer and network layer.
5. If packets cannot be forwarded after the LDP sessions are set up, proceed to the following
   step.

Step 2 Check and restore LSPs.
Run the display mpls ldp lsp command to check whether LSPs are set up. If the LSPs are not set
up, perform the following:
1. Check the policy by which LDP sets up LSPs. By default, labels are assigned only to the
   route to a local loopback interface to set up an LSP. If labels need to be assigned to the
   routes to all local interfaces, not only the loopback interface, the lsp trigger all command
   must be run for LDP.
2. Check that the label mapping message is received from the source device of the route. Then
   check whether the outbound interface and next hop of the route are those in the label
   mapping message. If the outbound interface and next hop are different, check whether the
   Interior Gateway Protocol (IGP) configuration is correct, or reset the IGP.
3. If MPLS VPN packets still fail to be forwarded after the LDP LSPs are set up successfully,
   proceed to the following step.

Step 3 Check and restore the configuration of VPN instances.
On a network, the parameters of the VPN instances on all the devices correlate with each other.
By setting different Route-Distinguishers, you can direct how routes are advertised and thus
control the correlation between routes.
1. Check whether the Route-Distinguisher of each VPN instance is unique and whether the VPN
   target configuration meets the requirements of network planning. If the Route-Distinguisher
   or VPN target does not meet the requirements, reconfigure it.
2. Check whether the interfaces are bound to VPN instances. If an interface is not bound to the
   relevant VPN instance, bind it again. Note that all IP-related configurations on an interface
   are removed when the interface is bound to a VPN instance. Therefore, you need to perform
   the IP-related configuration again.

Step 4 Check and restore VPN routes.
Run the display bgp vpnv4 all routing-table command on a PE to check whether the routes to
other PEs or Customer Edges (CEs) are correct. If the routes are incorrect, perform the following:
1. Check whether the Multiprotocol Border Gateway Protocol (MP-BGP) neighbor relationships
   between the PEs are set up. If the neighbor relationships are not set up, check whether the
   IGP advertises the route of the loopback interface to the peer. If it does not, modify the
   IGP configuration.
2. Check whether an address family is created for each VPN instance in the BGP view and
   whether the routes are imported to the BGP routing table according to the routing protocol
   used between PEs and CEs. Check whether the MP-BGP sessions between PEs use the loopback
   interfaces for protocol connections. If they do not, cancel the configuration and
   reconfigure the sessions.
3. If static routes are configured between PEs and CEs, check whether the next hop of each
   static route is directly connected. The next hop of such a static route must not require
   route iteration. If it does, delete the static route and reconfigure it.
4. If the fault persists after you perform the preceding checks, reset the relevant IGP and
   BGP. For example, reset all IS-IS connections through the reset isis all command and BGP
   connections through the reset bgp command.
5. If resetting the routing protocols is ineffective, proceed to the following step.

Step 5 Reset the system.
Resetting the system is the last, and the most effective, way to solve a software problem. If
other users are not affected, you can reset the system to solve the problem.
Before resetting a system, save the existing configurations with the save command, and then
run the reboot command to reset the system. If the fault affects only a small range, you can run
the schedule reboot command to reset the system during off-peak hours, such as the early morning.
NOTE
If the system can be reset through a software program, do not power off the device.

Step 6 Seek technical support.
For details about seeking Huawei technical support, see Technical Support.
----End
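
The uniqueness check on LSR IDs in Step 1 can be sketched as a simple duplicate search. The PE names and LSR IDs passed in are hypothetical values that an operator would collect by hand, for example from display mpls ldp on each PE; the sketch does not query any device.

```python
from collections import defaultdict
from typing import Dict, List


def find_conflicts(lsr_id_by_pe: Dict[str, str]) -> Dict[str, List[str]]:
    """Return every LSR ID that is configured on more than one PE.

    The result maps each conflicting LSR ID to the sorted list of PEs that
    use it. An empty result means that all LSR IDs are unique.
    """
    pes_by_lsr_id: Dict[str, List[str]] = defaultdict(list)
    for pe, lsr_id in sorted(lsr_id_by_pe.items()):
        pes_by_lsr_id[lsr_id].append(pe)
    return {
        lsr_id: pes
        for lsr_id, pes in pes_by_lsr_id.items()
        if len(pes) > 1
    }
```

The same function could be applied to the Route-Distinguishers checked in Step 3 by passing VPN instance names and their Route-Distinguishers instead.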

4 Guide to Fault Information Collection

About This Chapter


This chapter describes how to collect basic fault information and device fault information.
4.1 Overview
This section describes the collection and classification of fault information.
4.2 Collection of Basic Fault Information
This section describes the collecting items and methods of basic fault information.
4.3 Collection of Device Fault Information
This section describes the collecting items and methods of device fault information.


4.1 Overview
This section describes the collection and classification of fault information.
After an emergency occurs, collect and back up fault information promptly for reference. In
addition, provide the fault information to Huawei engineers for fault location and rectification.
When a fault occurs, collect the following information:
- Basic fault information
- Device fault information

4.2 Collection of Basic Fault Information


This section describes the collecting items and methods of basic fault information.
Table 4-1 lists the basic information to be collected when a fault occurs.
Table 4-1 Collection of basic fault information
No. | Collecting Item | Collecting Method
----+-----------------+------------------
1   | Fault duration | Maintain a record of the fault duration, accurate to the minute.
2   | Fault description | Collect the fault symptoms and maintain detailed records.
3   | Fault severity level | On the basis of the range that the fault affects and the severity of the fault, record the fault severity level according to the level definition.
4   | Software version | Run the display version command to collect information about the software version when you can log in to the device through Telnet or the console interface.
5   | Networking information | Provide a networking diagram in which the upstream and downstream devices and the peer interfaces are shown.
6   | Measures taken | Maintain records of the measures that have been taken and the results.

NOTE

When you collect fault information through command lines, you can copy the information displayed on
the console (the COM port or the Telnet terminal) and save it to a .txt file as a record.

4.3 Collection of Device Fault Information


This section describes the collecting items and methods of device fault information.
When a fault occurs, collect information described in Table 4-2 if you can log in to the device
through Telnet or the console interface.

Table 4-2 Collection of device fault information

No. | Collecting Item | Collecting Method
----+-----------------+------------------
1   | Device information | Run the display device command.
2   | Temperature | Run the display temperature command.
3   | CPU usage | Run the display cpu-usage command.
4   | Routing table information | Run the display ip routing-table command.
5   | Logs | Run the display logbuffer command.
6   | Traps | Run the display trapbuffer command.
7   | Configuration | Run the display current-configuration command.
8   | Diagnostic information about the device | Run the display diagnostic-information command.
9   | Interface information | Run the display interface command.
10  | Network connectivity information | Run the ping command to collect information about the network connectivity and record the results.
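
As a minimal sketch of keeping the collected output in one text record, the following Python fragment arranges pasted console output under one heading per command in Table 4-2. It does not connect to the device: the outputs dictionary is assumed to hold text that the operator has already copied from the console, and the function name and record layout are illustrative choices, not part of any Huawei tool.

```python
from typing import Dict

# Commands from Table 4-2, in table order.
COLLECTION_COMMANDS = [
    "display device",
    "display temperature",
    "display cpu-usage",
    "display ip routing-table",
    "display logbuffer",
    "display trapbuffer",
    "display current-configuration",
    "display diagnostic-information",
    "display interface",
    "ping",
]


def build_fault_record(outputs: Dict[str, str]) -> str:
    """Format pasted console output as one text record.

    outputs maps a command from COLLECTION_COMMANDS to the text copied from
    the console. Commands without output are marked "(not collected)" so
    that gaps in the record are visible.
    """
    sections = []
    for number, command in enumerate(COLLECTION_COMMANDS, start=1):
        text = outputs.get(command, "(not collected)")
        sections.append(f"===== {number}. {command} =====\n{text.rstrip()}\n")
    return "\n".join(sections)
```

The returned string can be written to a .txt file with the standard open() and write() calls and attached to the fault report.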

5 Guide to System Reboot

About This Chapter


This chapter describes how to prepare for, perform, and verify a system reboot, and how to
handle a reboot failure.
5.1 Overview
This section describes the applicable environment and precautions for system reboot.
5.2 Preparation for System Reboot
This section describes the preparation before rebooting a system.
5.3 Guide to System Reboot
This section describes how to reboot a system.
5.4 Verification of System Reboot
This section describes how to verify system reboot.
5.5 Handling of a System Reboot Failure
This section describes how to handle the failure to reboot a system.


5.1 Overview
This section describes the applicable environment and precautions for system reboot.

CAUTION
Do not reboot an S9300 arbitrarily. If a reboot is necessary, first read the guidelines and
precautions described in Overview of Emergency Maintenance, or reboot the system under the
guidance of Huawei engineers.
During the S9300 reboot, all services on the device are interrupted, except in dual-system hot
backup networking. The services are resumed after the S9300 is rebooted successfully.
An S9300 automatically reboots when an excessively severe fault occurs on it. After the
automatic reboot, the system begins to run normally. Therefore, you do not need to reboot the
system manually in this case.
Manually rebooting an S9300 is applicable only to an emergency or an exception. For example, if
an S9300 fails to reboot automatically when services on it are interrupted and the other
measures taken are ineffective, you can reboot it manually.

5.2 Preparation for System Reboot


This section describes the preparation before rebooting a system.
Before rebooting an S9300, ensure that its configuration files are backed up. Configuration
files that are backed up are executed automatically after the reboot, so the services can be
resumed. For the backup and restoration of configuration files, refer to Maintenance Guidelines
in the Quidway S9300 Terabit Routing Switch Routine Maintenance.

5.3 Guide to System Reboot


This section describes how to reboot a system.

CAUTION
Do not remove the LPU or a flexible plug-in card of the SRU in service. The boards of other
types are hot pluggable.
The S9300 can be manually rebooted in any of the following ways:
- Running command lines
- Pressing the RESET button on the MCU/SRU
- Switching the system off, and then on
- Operating through the NMS

NOTE

It is not recommended to reboot an S9300 remotely. If a remote reboot fails, services may be
interrupted for a long period.


5.3.1 Running Command Lines


When a manual reboot of an S9300 is required, it is recommended to reboot the system by
running command lines on the configuration terminal. You can reboot an S9300 by using either
of the following commands:

reboot Command
Enter the reboot command in the user view, and enter Y when the system prompts you to confirm.
The system then reboots. The operation example is as follows:
<Quidway> reboot
Info:The system is now comparing the configuration, please wait.
Info:Save current configuration?[Y/N]:y
Now saving the current configuration to the device
Info:The current configuration was saved to the masterboard device successfully.
System will reboot! Continue?[Y/N]:
NOTE

After you run the reboot command, the displayed information may vary with the system version.

schedule reboot Command


In the user view, the schedule reboot command has two forms: schedule reboot delay and
schedule reboot at.

The schedule reboot delay command enables the scheduled reboot function and sets the wait
delay.

You can set the wait delay in either of two formats: "hour:minute" or "absolute minutes".
The total delay cannot exceed 30 x 24 x 60 minutes (43,200 minutes, or 30 days).

The schedule reboot at command enables the scheduled reboot function and specifies the
reboot date and time. The specified date cannot be more than 30 days later than the
current date.
If the date parameter (yyyy/mm/dd) is specified and the date is in the future, the S9300
reboots at the specified time on that date, with an error of no more than one minute.
If no date is specified, one of the following situations occurs:

If the set time is later than the current time, the S9300 reboots at that time on the same day.

If the set time is earlier than the current time, the S9300 reboots at that time on the next day.

After the schedule reboot delay or schedule reboot at command is run, the system prompts
you to confirm the reboot. Enter Y or y, and the configuration takes effect. If a scheduled
reboot is already configured, the latest configuration overrides the previous one.
NOTE

If you adjust the system time through the clock command after running the schedule reboot delay or
schedule reboot at command, the previous configuration through the schedule reboot delay or schedule
reboot at command becomes invalid.

You can run the undo schedule reboot command to cancel a scheduled reboot configured with the
schedule reboot delay or schedule reboot at command.
You can run the display schedule reboot command to view the scheduled reboot configured with the
schedule reboot delay or schedule reboot at command.
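
The timing rules for the two schedule reboot forms can be checked offline before a maintenance window is agreed. The following Python sketch is illustrative only and is not part of the S9300 software. The function names parse_delay and reboot_time_at are hypothetical, and treating a set time equal to the current time as "the next day" is an assumption, because the text does not cover that case.

```python
from datetime import datetime, timedelta
from typing import Optional

MAX_DELAY_MINUTES = 30 * 24 * 60  # upper limit stated in the text


def parse_delay(text: str) -> int:
    """Return the wait delay in minutes for 'hour:minute' or absolute minutes."""
    if ":" in text:
        hours, minutes = text.split(":")
        total = int(hours) * 60 + int(minutes)
    else:
        total = int(text)
    if not 0 <= total <= MAX_DELAY_MINUTES:
        raise ValueError("delay must not exceed 30 x 24 x 60 minutes")
    return total


def reboot_time_at(now: datetime, hhmm: str, date: Optional[str] = None) -> datetime:
    """Return the expected reboot time for a 'schedule reboot at' request."""
    hour, minute = (int(part) for part in hhmm.split(":"))
    if date is not None:
        # Date given as yyyy/mm/dd: it must lie in the future.
        year, month, day = (int(part) for part in date.split("/"))
        target = datetime(year, month, day, hour, minute)
        if target <= now:
            raise ValueError("the specified date and time must be in the future")
    else:
        # No date: same day if the time is still ahead, otherwise the next day.
        target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
        if target <= now:  # assumption: an equal time also rolls over
            target += timedelta(days=1)
    if (target.date() - now.date()).days > 30:
        raise ValueError("the date cannot be more than 30 days later than today")
    return target
```

For example, with a current time of 12:00 on 2009/04/15, a request for 11:00 without a date resolves to 11:00 on 2009/04/16, whereas a request for 13:00 resolves to the same day.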

5.3.2 Pressing the RESET Button on the MCU/SRU


If command lines cannot be run on the S9300 or the configuration terminal cannot control
the S9300, reboot the S9300 manually by pressing the RESET button on the front panel of
the MCU/SRU.
The main control board of an S9312 or S9306 is the SRU.
The main control board of an S9303 is the MCU.

5.3.3 Switching Off and Switching On the System

CAUTION
Reboot an S9300 in this mode only when a critical fault in the power supply system of the
equipment room has caused the S9300 to be powered off. In this case, switch the S9300 off,
and then switch it on again when the power supply system returns to normal.
The S9300 chassis can hold two power modules, either both AC or both DC; AC and DC modules
cannot be mixed. It is recommended to install an active power module and a standby power
module in the chassis, working in 1+1 load balancing mode.
The switch of each power module is located on its front panel. Turn the switch to OFF to turn
off the power; turn the switch to ON to turn on the power.
When an S9300 uses two power modules for load balancing, you need to switch off both
power modules to turn off the power and switch on both power modules to turn on the power.

5.3.4 Operating Through the NMS


Context
The procedures for rebooting the S9300 through the iManager N2000 NMS are as follows:

Procedure
Step 1 In the topology navigation tree or the topology view, select the S9300 to be operated and right-click it.
Step 2 Choose Device Management > Reboot Device on the shortcut menu.
Step 3 Click Yes in the displayed dialog box.
----End

Postrequisite
While the S9300 is rebooting, its node icon is displayed as unavailable.
After the reboot succeeds, the node icon turns green.
For detailed operations, refer to the NMS Online Help.

5.4 Verification of System Reboot


This section describes how to verify system reboot.

CAUTION
After an S9300 is rebooted, check that the configuration data is recovered correctly and
completely, so that services are not interrupted by a failed recovery of configuration data. If
some configuration data is lost, add it manually and save it.
5.4.1 Displaying Information About System Reboot
5.4.2 Checking the Software Version and Configuration File

5.4.1 Displaying Information About System Reboot


The following example assumes that you log in to an S9312 through the console interface and
reboot it by running the reboot command. After the S9312 is rebooted, the display on the
configuration terminal is as follows:
Boardname ..................................................................SRU
Start L2 Cache Test ? ('t' is test):skip
Bootbus init.................................................................OK
DDR DRAM init................................................................OK
Memory Data Bus Walk '0' Test .............................................pass
Memory Data Bus Walk '1' Test .............................................Pass
Memory Address Bus Walk '0' Test ..........................................Pass
Memory Address Bus Walk '1' Test ..........................................Pass
Start Memory Unit Test ? ('t' or 'T' is test):skip
Copying Uncompressed Data from Rom to Ram .................................Done
Uncompressing Data from Rom to RAM ........................................Done
Initializing Flash Module .................................................Done
...
Starting...
Starting at 0x6c00000...
****************************************************
*                                                  *
*              S9300 Bootrom, Ver 003              *
*                                                  *
****************************************************
Copyright(C) 2008-2026 by HUAWEI TECHNOLOGIES CO., LTD.
Creation date: Dec 26 2008, 16:29:56
Board Type        : LE02SRUA
CPU type          : Cavium Octeon
CPU L2 Cache      : 128KB
CPU Clock Speed   : 700MHz
BUS Clock Speed   : 133MHz
Memory Type       : DDR2 SDRAM
Memory Size       : 512MB
Memory Speed      : 667MHz
...
Recover configuration...OK!
Press ENTER to get started.

The preceding display shows that the S9312 has rebooted successfully. Press Enter to enter the
user view.
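
If the console session is captured to a file, the check can be repeated mechanically. The following Python sketch is a minimal illustration and is not a Huawei tool. The function name check_boot_log is hypothetical. It assumes that every self-test line ends in "Test", a run of dots, and a status word, and that a successful reboot prints the line "Recover configuration...OK!" exactly as in the example above.

```python
import re

# A self-test line looks like: "<name> Test .......<status>"
TEST_LINE = re.compile(r"^(.*Test)\s*\.{3,}\s*(\w+)\s*$")


def check_boot_log(log):
    """Return a list of problems found in a captured boot display."""
    problems = []
    for line in log.splitlines():
        match = TEST_LINE.match(line)
        # Compare case-insensitively, because the display mixes "pass" and "Pass".
        if match and match.group(2).lower() != "pass":
            problems.append("%s: %s" % (match.group(1).strip(), match.group(2)))
    if "Recover configuration...OK!" not in log:
        problems.append("configuration recovery not confirmed")
    return problems
```

An empty list means that no self-test reported a status other than Pass and that configuration recovery was confirmed. Any other result should be investigated before services are considered restored.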

5.4.2 Checking the Software Version and Configuration File


You need to check the software version and configuration files after an S9300 is restarted.

Checking the Software Version


Run the display version command to check that the software version is correct. For example:
<Quidway> display version
Huawei Versatile Routing Platform Software
VRP (R) software, Version 5.50 (S9300 V100R001C02B112)
Copyright (C) 2003-2010 HUAWEI TECH CO., LTD
Quidway S9312 Terabit Routing Switch uptime is 0 week, 4 days, 6 hours, 7 minutes
BKP 0 version information:
1. PCB Version         : LE01BAKA VER.A
2. Board Type          : BAKA
3. MPU Slot Quantity   : 2
4. LPU Slot Quantity   : 12
SRU 13(Master) : uptime is 0 week, 4 days, 6 hours, 7 minutes
StartupTime 2009/02/01   08:45:09
SDRAM Memory Size      : 512 M bytes
Flash Memory Size      : 64 M bytes
NVRAM Memory Size      : 512 K bytes
CF Card1 Memory Size   : 494 M bytes
MPU version information :
1. PCB Version         : LE02SRUA VER.A
2. MAB Version         : 0
3. Board Type          : SRUA
4. CPLD1 Version       : 8120521
5. BootROM Version     : 3
6. BootLoad Version    : 3
LPU 9 : uptime is 0 week, 3 days, 21 hours, 54 minutes
StartupTime 2009/02/01   16:58:32
SDRAM Memory Size      : 128 M bytes
Flash Memory Size      : 8 M bytes
LPU version information :
1. PCB Version         : LE02G24C VER.A
2. MAB Version         : 0
3. Board Type          : G24CA
4. CPLD1 Version       : 8120410
5. BootROM Version     : 3
6. BootLoad Version    : 3

The preceding information displays the Versatile Routing Platform (VRP) version, host version,
and patch version. Check that the version numbers are the same as those before the system
reboot.
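
If the display version output was saved before the reboot, the comparison can be scripted. The following Python sketch is illustrative only. The function name software_version is hypothetical. It assumes that the version line has the form "Version <VRP version> (<product> <host version>)", as in the example above.

```python
import re

# Matches, for example: "Version 5.50 (S9300 V100R001C02B112)"
VERSION_LINE = re.compile(r"Version\s+(\S+)\s+\((\S+)\s+(\S+)\)")


def software_version(display_version_output):
    """Return (VRP version, host version) from display version output."""
    match = VERSION_LINE.search(display_version_output)
    if match is None:
        raise ValueError("no software version line found")
    return match.group(1), match.group(3)
```

Call the function on the output captured before the reboot and on the output captured after it. The two results should be equal. A difference indicates that the device started with a different system software package.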

Checking the Configuration Files


Run the display startup command to check that the configuration files are correct. For example:
<Quidway> display startup
MainBoard:
  Configed startup system software:        cfcard:/s9300v100r001c02b112.cc
  Startup system software:                 cfcard:/s9300v100r001c02b112.cc
  Next startup system software:            cfcard:/s9300v100r001c02b112.cc
  Startup saved-configuration file:        cfcard:/new.cfg
  Next startup saved-configuration file:   cfcard:/new.cfg
  Startup patch package:                   cfcard:/c02b112sph001.pat
  Next startup patch package:              cfcard:/c02b112sph001.pat

The preceding information displays the name of the current startup file, the name of the current
configuration file, and the patch package loaded at startup.
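
One useful check on this output is whether each file used for the current startup is also the one set for the next startup. The following Python sketch is illustrative only. The function name startup_mismatches is hypothetical. It assumes that the terminal shows each field as "label: value" on a single line.

```python
import re

# "label: value" on one line; the value itself may contain a colon (cfcard:/...)
FIELD = re.compile(r"^\s*(.+?):\s+(\S+)\s*$")


def startup_mismatches(display_startup_output):
    """Return {item: (current, next)} for items whose next startup file differs."""
    fields = {}
    for line in display_startup_output.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    mismatches = {}
    prefix = "Next startup "
    for label, next_value in fields.items():
        if label.startswith(prefix):
            item = label[len(prefix):]
            current = fields.get("Startup " + item)
            if current is not None and current != next_value:
                mismatches[item] = (current, next_value)
    return mismatches
```

An empty result means that the next reboot will use the same system software, configuration file, and patch package as the current startup. A difference is not necessarily an error, but it should be intentional.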

5.5 Handling of a System Reboot Failure


This section describes how to handle a system reboot failure.
If any problem arises during the reboot of the S9300, contact the local Huawei engineers.

6 Emergency Maintenance Record Table

About This Chapter


This chapter describes how to fill in the emergency maintenance record table.
6.1 Notice of Emergency Maintenance
6.2 Emergency Record Table

6.1 Notice of Emergency Maintenance


If the maintenance personnel cannot remove a fault through emergency maintenance, they can
call or fax Huawei to obtain technical support. The maintenance personnel must keep a
detailed record of the emergency maintenance procedures, notify Huawei of the type of the
board to be replaced, and apply for a spare one according to the warranty articles.
The fault can thus be removed sooner.
The following format can be used in the fax.
Table 6-1 Notice of emergency maintenance

Filled by the customer
  Telecom office:              Device model:               Capacity:
  Customer:                    Phone No.:                  Version:
  Complaint time:              Required response time:     In warranty: Yes / No
  Fault description and handling procedure (in detail):
  Approved by:                 Signature:

Filled by Huawei engineers
  Handling method: Guide in call / Remote maintenance / On-site support
  Result (attachment):
  Handled by:                  Date:
  Remaining problems:

6.2 Emergency Record Table


Emergency record table
Telecom office: __________________   Date (MM/DD/YY): __________________
Person on duty:                      Handler:
Fault source: Customer complaint / Routine maintenance / Alarms / Other sources
Basic information:
Fault symptom:
Solution and result:
7 System Upgrading Through BIOS

This chapter describes how to upgrade the system through the BIOS.

Context

CAUTION
The process of upgrading the system through the Basic Input/Output System (BIOS) is
complicated, and this method is not recommended. Use the BIOS only when the host
program of the S9300 cannot be started.
In the BIOS, the S9300 can function only as an FTP client. The operation terminal must be
connected to the S9300 through the COM port and communicate with the S9300 through the
HyperTerminal.
NOTE

The S9312 upgrade procedure is used as an example here. The upgrade procedure for the S9303 and
S9306 is the same as that for the S9312.

Procedure
Step 1 Run an FTP server on the configuration terminal or PC and specify the path of the system
files. Create an FTP user named 9300 and set the password to 9300.
Step 2 Reboot the S9312.
When the S9312 is powered on, the PC or terminal used to set up the configuration environment
displays the following:
input 'm' to Select Debug Console:
Boardname ..................................................................SRU
Start L2 Cache Test ? ('t' is test):skip
BIOS Creation Date ....................................... Feb 2 2009 14:48:10
Bootbus init.................................................................OK
DDR DRAM init................................................................OK
Memory Data Bus Walk '0' Test .............................................Pass
Memory Data Bus Walk '1' Test .............................................Pass
Memory Address Bus Walk '0' Test ..........................................Pass
Memory Address Bus Walk '1' Test ..........................................Pass
Start Memory Unit Test ? ('t' or 'T' is test):skip
Copying Uncompressed Data from Rom to Ram .................................Done
Uncompressing Data from Rom to RAM ........................................Done
Initializing Flash Module .................................................Done
Starting...

The S9312 first starts the basic BootROM and then starts the extended BootROM menu.
If a fault occurs during detection or for other reasons, the system displays the basic BootROM menu.
You can also press Ctrl+A within two seconds to enter the basic BootROM menu. Otherwise,
the system automatically starts the extended BootROM menu.
The basic BootROM menu is used to upgrade the basic BootROM and the extended BootROM. The
menu is as follows:
Update BIOS menu(ver004)
Creation date: Feb 2 2009 14:48:04
1. Update base BIOS through serial interface
2. Update extend BIOS through serial interface
3. Modify serial interface parameter
4. Boot extend BOIS system
5. Reboot

Enter your choice(1-5):


NOTE

To upgrade the BootROM, you need to change the baud rate and then download the files. After the
upgrade, restore the connection rate to the default value of 9600 bit/s; otherwise, information may
not be displayed when you start or restart the system.

After you select 4, the system copies the extended BootROM to the SDRAM, and then decompresses
and starts it. After startup, the system displays the extended BootROM menu.

The extended BootROM menu is as follows:


****************************************************
*                                                  *
*              S9300 Bootrom, Ver 256              *
*                                                  *
****************************************************
Copyright(C) 2008-2026 by HUAWEI TECHNOLOGIES CO., LTD.
Creation date: Feb 2 2009, 14:49:25
PCB Version       : LE02SRUA VER.A
CPU type          : Cavium Octeon
CPU L2 Cache      : 128KB
CPU Clock Speed   : 700MHz
BUS Clock Speed   : 133MHz
Memory Type       : DDR2 SDRAM
Memory Size       : 512MB
Memory Speed      : 667MHz

CF Card Init....Done
Press Ctrl+B to enter Boot Menu... 3

Step 3 Press Ctrl+B within three seconds.


password:

Step 4 Enter the password for the main BootROM menu.


NOTE

The initial password is 7800, which can be changed. If three wrong passwords are entered consecutively,
the system restarts.

When a correct password is entered, the following BootROM menu is displayed:


MAIN MENU
1. Boot with default mode
2. Boot from Flash
3. Boot from CFCard
4. Enter serial submenu
5. Enter ethernet submenu
6. Modify Flash description area
7. Modify bootrom password
8. Reboot

Enter your choice(1-8): _

Step 5 Press 5 to display the Ethernet submenu.


ETHERNET SUBMENU
1. Download file to SDRAM through ethernet interface and boot
2. Download file to Flash through ethernet interface
3. Download file to CFCard through ethernet interface
4. Modify ethernet interface parameter
5. Return to main menu

Be sure to modify parameter before downloading!
Enter your choice(1-5):

Step 6 Press 4 to set the Ethernet interface parameters.


Note: Two protocols for download, tftp & ftp.
You can modify the flags following the menu.
tftp--0x80, ftp--0x0.
'.' = clear field;  '-' = go to previous field;  ^D = quit

boot device           : eth1
processor number      : 0
host name             : host
file name             : s9300.cc            # Name of the software program to be loaded
inet on ethernet (e)  : 192.168.1.1:ffffff00
inet on backplane (b) : 192.168.1.1
host inet (h)         : 192.168.1.2         # IP address of the FTP server
gateway inet (g)      :
user (u)              : 9300                # FTP user name
ftp password (pw) (blank = use rsh): 9300   # FTP login password
flags (f)             : 0x0
target name (tn)      : octeon
startup script (s)    :
other (o)             :

The preceding information shows that the name of the software program to be loaded is s9300.cc,
the IP address of the FTP server is 192.168.1.2, the FTP user name is 9300, and the password
is 9300. Modify the preceding parameters according to the actual situation. The other parameters
do not need to be modified.
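
Before starting the download, it can help to confirm that the FTP server address is reachable from the address assigned to the Ethernet interface. The following Python sketch is illustrative only. It assumes that the value after the colon in "inet on ethernet (e)" (ffffff00 in the example) is the subnet mask in hexadecimal; this is an interpretation of the display and is not stated in this document. The function names are hypothetical. The mapping of the flags field follows the on-screen note "tftp--0x80, ftp--0x0".

```python
import ipaddress


def parse_inet(field):
    """Parse 'address:hexmask', for example 192.168.1.1:ffffff00."""
    address, separator, hexmask = field.partition(":")
    if not separator:
        raise ValueError("no subnet mask given in %r" % field)
    # Assumption: the suffix is the subnet mask written in hexadecimal.
    netmask = ipaddress.IPv4Address(int(hexmask, 16))
    return ipaddress.IPv4Interface("%s/%s" % (address, netmask))


def server_on_same_subnet(inet_field, host_inet):
    """Return True if the FTP server is on the subnet of the Ethernet interface."""
    return ipaddress.IPv4Address(host_inet) in parse_inet(inet_field).network


def download_protocol(flags):
    """Map the flags field to the download protocol (tftp--0x80, ftp--0x0)."""
    return "tftp" if flags & 0x80 else "ftp"
```

With the values shown above, the interface 192.168.1.1:ffffff00 and the server 192.168.1.2 are on the same subnet, so no gateway is needed. If the server is on a different subnet, the gateway inet (g) field presumably has to be set as well.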

Step 7 Press 3 to download the program to the CF card.


Be sure to modify parameter before downloading!
Enter your choice(1-5):
boot device           : eth
unit number           : 1
processor number      : 0
host name             : host
file name             : s9300.cc
inet on ethernet (e)  : 10.164.44.119
inet on backplane (b) : 192.168.1.1:ffffff00
host inet (h)         : 10.164.19.46
gateway inet (g)      : 10.164.44.1
user (u)              : 9300
ftp password (pw)     : 9300
flags (f)             : 0x0
target name (tn)      : octeon

Loading...................................Done!
Please type a new file name for saving it.
Press return key to save it named "s9300.cc".
Check disk space
Writing file..............................................................................
...........................................Done

Step 8 Press 5 to return to the main menu.


ETHERNET SUBMENU
1. Download file to SDRAM through ethernet interface and boot
2. Download file to Flash through ethernet interface
3. Download file to CFCard through ethernet interface
4. Modify ethernet interface parameter
5. Return to main menu

Be sure to modify parameter before downloading!
Enter your choice(1-5):

Step 9 Press 6 to change the startup file to the new program.


MAIN MENU
1. Boot with default mode
2. Boot from Flash
3. Boot from CFCard
4. Enter serial submenu
5. Enter ethernet submenu
6. Modify Flash description area
7. Modify bootrom password
8. Reboot

Enter your choice(1-8):


Modify flash description area
Please select booting device.
Press ENTER directly for no change or input your choice.
1: Flash, 2: CF Card
Current booting device: 2, your choice: 2                   # Start from the CF card
Current booting File Name: cfcard:/s9300.cc,
Press ENTER directly for no change.
Or, please input the file name (e.g. s9300.cc): s9300.cc    # Enter the name of the new upgrade program
The expected booting file: cfcard:/s9300.cc.
Are you sure? Yes or No(Y/N)y                               # Enter y to confirm the change
Writting descriptor to flash...OK!
Writting backup descriptor to flash...OK!

Step 10 Press 8 to reboot the S9312.


----End
