
Date 15th Oct 2016

Trained by Vinay Nyalakonda


GSS.IER.Unix.L1

Confidential and proprietary materials for authorized Verizon personnel and outside agencies only. Use, disclosure or distribution of this material is not permitted to any unauthorized persons or third parties except by written
Cloud Introduction

Cloud deployment models
Introduction to SLES system data and crash dump analysis:-
Overview of the tools used for SLES supportconfig data collection, including their installation and execution.
Overview of crash dump analysis on SLES, including installation and execution of the tools.

Introduction to Oracle Solaris explorer and crash analysis:-
Overview of the tools used for Oracle Solaris explorer data collection, including their installation and execution.
Overview of crash dump analysis on Oracle Solaris, including installation and execution of the tools.

RHEL6/7 SOS Report Analysis
What is a sosreport?
The sosreport command collects configuration and diagnostic information from a
Red Hat Enterprise Linux (or CentOS) system, for instance the running kernel
version, loaded modules, and system and service configuration files. It
generates a compressed tarball of debugging information that gives an overview of
the most important logs and configuration of the system, to be sent to Red Hat
Support. Among other things, the sosreport includes information about the installed
rpm versions, syslog, network configuration, mounted filesystems, disk partition
details, loaded kernel modules and the status of all services.

How to install/generate a sosreport?

To run sosreport, the sos package must be installed. It is usually installed by
default, unless the system was installed with a custom package set. If it is not
installed, install it from the yum repository by running yum install sos as root.
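The check described above can be sketched in shell. This is only an illustration: need_sos is a hypothetical helper name, and it merely tests whether a sosreport executable is on the PATH before suggesting the install command.

```shell
# Sketch: report whether the sos package appears to be installed.
# need_sos is a hypothetical helper name, not part of sos itself.
need_sos() {
    # command -v succeeds when the sosreport executable is on PATH
    if command -v sosreport >/dev/null 2>&1; then
        echo "installed"
    else
        echo "missing"
    fi
}

status=$(need_sos)
echo "sos package: $status"
if [ "$status" = "missing" ]; then
    # Only a hint is printed; nothing is installed by this sketch.
    echo "run as root: yum install sos"
fi
```

On an RPM-based system, rpm -q sos gives the same answer together with the installed version.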
Some of the SOS RPMs:-

Example, sos-3.0-23.el7.noarch.rpm
Provides
sos
config(sos)

Requires
/usr/bin/python
bzip2
config(sos) = 3.0-23.el7
libxml2-python
python(abi) = 2.7
rpm-python
rpmlib(CompressedFileNames) <= 3.0.4-1
rpmlib(FileDigests) <= 4.6.0-1
rpmlib(PartialHardlinkSets) <= 4.0.4-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1
tar
xz
rpmlib(PayloadIsXz) <= 5.2-1

License
GPLv2+

Generating SOS report:-
Creating a sosreport can be as simple as running the command in a terminal, without arguments, as root:

# sosreport

It will ask for some information related to a support case:

# sosreport

sosreport (version 2.2)

This utility will collect some detailed information about the hardware and setup of your Red Hat Enterprise Linux system.
The information is collected and an archive is packaged under /tmp, which you can send to a support representative.
Red Hat Enterprise Linux will use this information for diagnostic purposes ONLY and it will be considered
confidential information.

This process may take a while to complete.


No changes will be made to your system.

Press ENTER to continue, or CTRL-C to quit.

Please enter your first initial and last name [geeklab]: Sandeep
Please enter the case number that you are generating this report for [None]:

On completion, a compressed tarball will be created in /tmp, along with a file containing the md5sum so that the file's
integrity can be verified by the support representative. The filename will be printed to the terminal:

Creating compressed archive...


Your sosreport has been generated and saved in:
/tmp/sosreport-Sandeep-20151011150306-c847.tar.xz

The md5sum is: ef729c471178c87582ae422290c1c847

Please send this file to your support representative in Red Hat Support.

It is possible to have the sosreport created somewhere other than /tmp by setting the TMPDIR environment variable
when running the sosreport command:

# TMPDIR=/home/jdoe sosreport
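
The md5sum printed at the end of the run can be checked after the archive has been copied elsewhere. A minimal sketch, assuming the md5sum utility from coreutils is available; verify_md5 is a hypothetical helper, and a scratch file stands in for the real archive so the example is safe to run anywhere.

```shell
# verify_md5 FILE EXPECTED: succeed only if FILE's md5sum equals EXPECTED.
# (hypothetical helper name, for illustration only)
verify_md5() {
    actual=$(md5sum "$1" | awk '{ print $1 }')
    [ "$actual" = "$2" ]
}

# Demonstration on a scratch file standing in for the sosreport archive.
demo=$(mktemp)
printf 'hello\n' > "$demo"
expected=$(md5sum "$demo" | awk '{ print $1 }')

if verify_md5 "$demo" "$expected"; then
    echo "checksum OK"
else
    echo "checksum MISMATCH"
fi
```

For a real report, the first argument would be the archive path printed by sosreport and the second the md5sum shown in its output.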
Additional options
To list available plugins in sosreport:
The sosreport command has a modular structure and allows the user to enable and disable modules and specify module
options via the command line. To list available modules (plug-ins), use the following command:
# sosreport -l
sosreport (version 2.2)
The following plugins are currently enabled:
acpid acpid related information
anaconda Anaconda / Installation information
auditd Auditd related information
autofs autofs server-related information
bootloader Bootloader information
cgroups cgroup subsystem information
crontab Crontab information
devicemapper device-mapper related information (dm, lvm, multipath)
dovecot dovecot server related information
filesys information on filesystems
networking network setup related information
... etc.
To temporarily turn off a module:
Include it in a comma-separated list of modules passed to the -n/--skip-plugins option.
For instance, to temporarily disable both the kvm and amd modules (if broken):
# sosreport -n kvm,amd
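
When several plugins have to be skipped, the comma-separated argument can be assembled from a list of names. The sketch below only builds and prints the command line; skip_list is a hypothetical helper name.

```shell
# skip_list PLUGIN...: join plugin names with commas for the -n option.
# (hypothetical helper; runs in a subshell so IFS is not changed globally)
skip_list() {
    (IFS=,; echo "$*")
}

skip=$(skip_list kvm amd)
# Print the command rather than run it, so the sketch has no side effects.
echo "sosreport -n $skip"
```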

If the system has a lot of packages installed and sosreport takes a long time to complete, support may request that
you disable the rpm database verification (which verifies all packaged files on the filesystem against the rpm database):

# sosreport -k rpm.rpmva=off

Some of the guidelines provided for SOS report analysis by Red Hat Support:-

The sosreport command will normally complete within a few minutes on Red Hat Enterprise Linux 6. Older versions may
take longer to complete. Depending on local configuration and the options specified, the command may in some cases take
longer to finish. If you are concerned about the run time of the sosreport command, contact your Red Hat support
representative for assistance.

Once completed, sosreport will generate a compressed file under /tmp (for RHEL6 and earlier) or under /var/tmp (for
RHEL7 and later). Different versions use different compression schemes (gz, bz2, or xz). The file should be provided to your
support representative (normally as an attachment to an open case).

The size of the archive varies depending on system configuration and any optional sosreport features that are enabled (for
example specifying the "all_logs" option of the general module to collect all syslog log files may greatly increase the size of
the archive).

If the collected sosreport is too big to upload to the case, it can be uploaded to the Red Hat FTP site at dropbox.redhat.com.

What to do if sosreport hangs?
If sosreport fails with "No space left on device":
Verify there is enough space on the filesystem containing /tmp
# df -h /tmp
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_ren-lv_root
50G 49G 1M 99% /
Free additional space, or specify a different directory using the option below
# sosreport --tmp-dir /path/to/another/volume
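
The space check above can be scripted so that a suitable directory is chosen before sosreport is started. A minimal sketch: avail_kb and pick_dir are hypothetical helper names, and the 1 GiB threshold is an arbitrary figure chosen for illustration, not a documented requirement.

```shell
# avail_kb DIR: print available space (KiB) on the filesystem holding DIR.
avail_kb() {
    # -P forces the one-line-per-filesystem POSIX format, -k reports
    # 1024-byte blocks; column 4 of the second line is "Available".
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# pick_dir MIN_KB DIR...: print the first DIR with at least MIN_KB free.
pick_dir() {
    min=$1; shift
    for d in "$@"; do
        kb=$(avail_kb "$d" 2>/dev/null)
        if [ -n "$kb" ] && [ "$kb" -ge "$min" ]; then
            echo "$d"
            return 0
        fi
    done
    return 1
}

# Look for roughly 1 GiB free (arbitrary threshold for this example).
if target=$(pick_dir 1048576 /tmp /var/tmp); then
    echo "sosreport --tmp-dir $target"
else
    echo "no candidate directory has enough free space"
fi
```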

The sosreport may have hung because of a specific plugin. Try to determine which plugin it is hanging on.
1. Increase verbosity: sosreport -vvvv
2. Strace it

Once you determine which plugin it is hanging on, exclude it. Enter the following command to show all plugins:
# sosreport -l
e.g. To exclude the filesys plugin: # sosreport -n filesys
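
One way to keep a hung run from blocking the terminal is to wrap it in a time limit and save the verbose output, so the end of the log shows where it stopped. This is a sketch, assuming GNU coreutils timeout is available; run_limited is a hypothetical helper, and sleep is used as a stand-in command so the example is harmless.

```shell
# run_limited SECONDS CMD...: run CMD, giving up after SECONDS.
# timeout(1) from GNU coreutils exits with status 124 when the limit is hit.
run_limited() {
    limit=$1; shift
    timeout "$limit" "$@"
    rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "timed out after ${limit}s: $*" >&2
    fi
    return "$rc"
}

# Stand-in command so the sketch is safe to run anywhere. On a real system
# the call might look like (the 1800 s limit is an arbitrary example):
#   run_limited 1800 sosreport --batch -vvvv > /tmp/sos-verbose.log 2>&1
run_limited 1 sleep 5 || echo "exit status: $?"
```

After a timeout, the last lines of the saved log are the place to look for the plugin that was running.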

Otherwise, instead of collecting the failed sosreport, a manual report may be created by running the following script. Note
that this will only be a portion of what is collected in a standard sosreport.

#!/bin/bash
host="$(hostname)"

sos_dir="/tmp/${host}_hungsos"
mkdir $sos_dir
cd $sos_dir
chkconfig --list > chkconfig
date > date
df > df
dmesg > dmesg
dmidecode > dmidecode
fdisk -l > fdisk
free > free
hostname --fqdn > hostname
ifconfig > ifconfig
lsmod > lsmod
lspci > lspci
cat /proc/mounts > mount
netstat -tlpn > netstat
ps auxww > ps
rpm -qa > rpm-qa
rpm -Va > rpm-Va #this command may take a while to run
ulimit -a > ulimit
uname -a > uname
uptime > uptime
cat /proc/meminfo > meminfo
cat /proc/cpuinfo > cpuinfo
mkdir etc
cd etc
cp /etc/fstab .
cp /etc/cluster/cluster.conf .
cp /etc/security/limits.conf .
cp /etc/redhat-release .
cp /etc/sysctl.conf .
cp /etc/modprobe.conf .
mkdir sysconfig/network-scripts -p
cd sysconfig
cp /etc/sysconfig/* . -R
cd $sos_dir
mkdir var/log -p
cp /var/log/message* var/log -R
tar cvzf /tmp/`hostname`-partial-sos.tar.gz /tmp/<sos temporary directory>
Collecting sosreport from rescue mode:-
If the system does not boot, sosreport can be collected from rescue environment for troubleshooting purpose.
Issue
How to run sosreport in rescue mode?
How to generate sosreport for system that can not boot?
Red Hat Enterprise Linux system does not boot.
How to collect system information and logs for Technical Support to troubleshoot?
System hung or had a kernel panic and now it hangs or gives me an error on reboot, how can I create a sosreport?
Resolution
To generate the sosreport output from the rescue environment, boot the system with the installation disc of
the corresponding version of Red Hat Enterprise Linux and follow this procedure:
1. Enter linux rescue at the boot prompt.
NOTE: In RHEL6 press [Tab] to get to the boot prompt, then append the below text at the end of the vmlinuz line:
linux rescue
NOTE: If the system is multipathed, the following text should be used instead:
linux rescue mpath
NOTE: In RHEL7, select "Troubleshooting", then "Rescue a Red Hat Enterprise Linux system".
2. Once the rescue environment finishes booting, choose a language to use, if prompted.
3. Choose a keyboard layout to use, if prompted.
4. Wait for network interfaces to be located, and activate them, so that requested data can be transferred to another host.
5. The rescue environment will try to find the current Red Hat Enterprise Linux installation on
the system. You will be prompted with the following options:
"Continue": continue mounting all of its partitions under /mnt/sysimage/ in Read & Write mode
"Read-Only": continue mounting all of its partitions under /mnt/sysimage/ in Read Only mode
"Skip": skip the mounting of the discovered Red Hat Enterprise Linux installation and proceed with manual mounting
NOTE: Select "Continue".
If you select "Skip", you will have to manually mount your filesystem before performing the next step.
6. Run the following commands to continue generating sosreport:
# chroot /mnt/sysimage
# sosreport
The sosreport command can take some time to generate a report. It collects a significant
amount of information that may help Red Hat technicians resolve your issue.

7. Once sosreport generation completes, it will provide the output in the /tmp directory while
the user is in the chroot environment. Exit the chroot environment and locate the sosreport
generated in the /mnt/sysimage/tmp directory.
Warning: During the running of the command sosreport you will be prompted for your name
and case number. Use only letters and/or numbers when filling out this field. Adding other
characters could damage the system or render the report unusable.
Note: While rescue mode will attempt to bind the mount points to /mnt/sysimage, sometimes
this fails and the following error is seen when attempting to run the sosreport:
error on sosreport: no such file or directory /dev/urandom
If this error is seen type the following commands to first exit the chroot'ed environment and
mount the necessary data:
# exit <-- this will exit the chroot'ed environment
# mount -o bind /dev /mnt/sysimage/dev
# mount -o bind /sys /mnt/sysimage/sys
# mount -o bind /proc /mnt/sysimage/proc
# chroot /mnt/sysimage
and then run the sosreport as stated above.
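
The bind-mount sequence above follows one pattern per pseudo-filesystem, so it can be generated with a loop. The sketch below only prints the commands (bind_cmds is a hypothetical helper); in the rescue shell the printed lines would be run as root.

```shell
# bind_cmds ROOT: print the bind-mount commands for a chroot at ROOT.
# (hypothetical helper; prints instead of executing so it is harmless)
bind_cmds() {
    for fs in dev sys proc; do
        echo "mount -o bind /$fs $1/$fs"
    done
    echo "chroot $1"
}

bind_cmds /mnt/sysimage
```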
Generate sosreport to an alternative location:-
If there are space constraints in /tmp, it is possible to direct the sosreport output to an alternative
location with the "--tmp-dir" option.
1. # mkdir /root/sos
2. # sosreport --tmp-dir /root/sos
It will generate the sosreport under /root/sos instead of /tmp.

Collecting sosreport manually:-


If all else fails and it is not possible to generate sosreport at all, the following solution goes
over collecting the files manually:

Issue
Sosreport hangs.
Sosreport will not run to completion on system.
Sosreport command is not working.
How to create a sosreport manually.
Need to provide the information related to the system without sosreport.
How do I collect data if sosreport is not installed?
I'm attempting to run a sosreport of the server, but it appears to be hanging.
How to collect the required logs from the server without running sosreport
I want to capture the system configuration periodically without sosreport.

Resolution:-
Some of the data collected by sosreport can alternatively be collected using the following script. It may
take up to 5-10 minutes to run, depending on the size of the logs.
1. Create a file named /tmp/man_sosreport.sh, copy the script given below and save it in that file.
2. Execute "chmod +rwx /tmp/man_sosreport.sh".

#!/bin/bash -x
export LANG=C
set -o verbose
mkdir /tmp/sosreport && cd /tmp/sosreport && mkdir -p etc/lvm etc/sysconfig network storage sos_commands/networking

#System
hostname > hostname
cp -a /etc/redhat-release ./etc/
uptime > uptime
#Applications
chkconfig --list > chkconfig
top -bn1 > top_bn1
service --status-all > service_status_all
date > date
ps auxww > ps_auxww
ps -elf > ps_-elf
rpm -qa --last > rpm-qa
rpm -Va > rpm-Va # this can take a minute to finish

#memory
free -m > free
vmstat > vmstat

#Network
ifconfig > ./network/ifconfig
netstat -s > ./network/netstat_-s
netstat -agn > ./network/netstat_-agn
netstat -neopa > ./network/netstat_-neopa
route -n > ./network/route_-n
for i in $(ls /etc/sysconfig/network-scripts/{ifcfg,route,rule}-*) ; do echo -e "$i\n----------------------------------"; cat $i; echo -e "\n"; done > ./sos_commands/networking/ifcfg-files
for i in $(ifconfig | grep "^[a-z]" | cut -f 1 -d " "); do echo -e "$i\n-------------------------" ; ethtool $i; ethtool -k $i; ethtool -S $i; ethtool -i $i; echo -e "\n" ; done > ./sos_commands/networking/ethtool.out
cp /etc/sysconfig/network ./sos_commands/networking/
cp /etc/sysconfig/network-scripts/ifcfg-* ./sos_commands/networking/
cp /etc/sysconfig/network-scripts/route-* ./sos_commands/networking/
cat /proc/net/bonding/bond* > ./sos_commands/networking/proc-net-bonding-bond
iptables --list --line-numbers > ./sos_commands/networking/iptables_--list_--line-numbers
ip route show table all > ./sos_commands/networking/ip_route_show_table_all
ip link > ./sos_commands/networking/ip_link
#Storage/Filesystems
df > df
fdisk -l > fdisk
cp -a /etc/fstab ./etc/
cp -a /etc/lvm/lvm.conf ./etc/lvm/
cp -a /etc/multipath.conf ./etc/
mount > mount
iostat -tkx 1 10 > iostat_-tkx_1_10
parted -l > storage/parted_-l
vgdisplay > storage/vgdisplay
lvdisplay > storage/lvdisplay
pvdisplay > storage/pvdisplay
multipath -ll > storage/multipath_ll
pvscan > storage/pvscan
vgscan > storage/vgscan
lvscan > storage/lvscan
dmsetup info -c > storage/dmsetup_info_c
dmsetup status > storage/dmsetup_status
dmsetup table >
#Kernel
cp -a /etc/security/limits.conf ./etc/
cp -a /etc/sysctl.conf ./etc/
ulimit -a > ulimit
cat /proc/slabinfo > slabinfo
cat /proc/interrupts > interrupts
cat /proc/iomem > iomem
cat /proc/ioports > ioports
slabtop -o > slabtop_-o
uname -a > uname
sysctl -a > sysctl_-a
lsmod > lsmod
cp -a /etc/modprobe.conf ./etc/
cp -a /etc/sysconfig/* ./etc/sysconfig/
for MOD in `lsmod | grep -v "Used by"| awk '{ print $1 }'`; do modinfo $MOD >> modinfo; done;
# Hardware
dmidecode > dmidecode
lspci -vvv > lspci_-vvv
lspci > lspci
cat /proc/meminfo > meminfo
cat /proc/cpuinfo > cpuinfo

# KDump
cp -a /etc/kdump.conf ./etc/
ls -laR /var/crash > ls-lar-var-crash
ls -1 /var/crash | while read n; do mkdir -p var/crash/${n}; cp -a /var/crash/${n}/vmcore-dmesg* var/crash/${n}/; done

# Logs
mkdir -p ./var/log
cp -a /var/log/* ./var/log/
cp -a /etc/*syslog.conf ./etc/
tar cvjf /tmp/sosreport.tar.bz2 ./
Run the script

# cd /tmp
# chmod +rwx man_sosreport.sh
# ./man_sosreport.sh

Attach /tmp/sosreport.tar.bz2 to the support case.

What are all the identified side effects of sosreport generation?
1. sosreport loads unused kernel modules unintentionally.
Issue
When a customer runs sosreport, some kernel modules are loaded as a side effect. Loading unwanted kernel modules
increases memory consumption and can trigger further side effects of the module load.
Environment
Red Hat Enterprise Linux 5
Red Hat Enterprise Linux 6.2
Red Hat Enterprise Linux 6.3
Red Hat Enterprise Linux 6.4
Version-Release number of selected component (if applicable): sos-1.7-9.16.el5 openswan-2.6.14-1.el5_2.1 ipvsadm-1.24-8.1, sos-2.2-17.el6_2.3
Resolution
Use the workaround as follows: # sosreport -n openswan
This issue is fixed in the following openswan updates:
RHEL 6.2.z EUS - openswan-2.6.32-13.el6_2.
This package is available via Errata RHBA-2013-1160
RHEL 6.3.z EUS - openswan-2.6.32-20.el6_3.
This package is available via Errata RHBA-2013-1161
RHEL 6.4 - openswan-2.6.32-21.el6_4.
This package is available via Errata RHBA-2013-1162
Root Cause
When openswan is installed, sosreport loads iptable-related modules including iptable_nat.ko. (With either openswan-2.6.14-1.el5_2.1 from RHEL5.3 or openswan-2.6.21-5.el5 from RHEL5.4.) It seems the openswan plugin of sosreport executes "ipsec barf", which executes iptables and loads modules.
When ipvsadm is installed, sosreport loads the ip_vs.ko module. (Reproduced with ipvsadm-1.24-8.1. Not reproducible with ipvsadm-1.24-10.) It seems sosreport executes "service --status-all", which executes "service ipvsadm status"; ipvsadm -L then loads the module.
When lsmod output contains "nat", "filter" or "mangle" (e.g. "ip6table_filter"), sosreport loads the corresponding iptable_X.ko. collectIPTable() in the networking plugin runs "iptables -n nat -L" if lsmod output contains the "nat" string.

2. sosreport loads the bridge kernel module unintentionally, which sets net.bridge.bridge-nf-call-arptables = 1
Environment: Red Hat Enterprise Linux 6
Issue
When sosreport is run, some kernel modules are loaded as a side effect; the loading of unwanted kernel modules
consumes memory and may cause other issues
The output of sysctl -a changes after a run of sosreport due to the bridge kernel module being loaded
The bridge module's sysctls are initialized to their defaults, instead of being properly initialized via /etc/sysctl.conf
Despite not having any bridge interfaces configured, you find you have these set in sysctl -a output:
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
Resolution
A fix for unintentional bridge kernel module loading while sosreport is run is included in sos-3.2-40.el6 (RHBA-2016-0819) or later
Workaround: If you are running sosreport for an issue that is not network-related, you may disable the networking plugin
by running sosreport -n networking to avoid this behavior.

Root Cause
sosreport invokes the brctl show command towards the end of its networking module run. The brctl show command
dynamically loads the bridge kernel module, which in turn initializes a number of bridge-related sysctls to their
module-default values. Because they are initialized post-boot, nothing is in place to set them to the values specified
by the system administrator in /etc/sysctl.conf.
Diagnostic Steps
$ lsmod | grep bridge
$ sysctl -a | grep bridge-nf
$ brctl show
bridge name bridge id STP enabled interfaces
$ lsmod | grep bridge
bridge 82775 0
stp 2218 1 bridge
llc 5578 2 bridge,stp
$ sysctl -a | grep bridge-nf
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-filter-vlan-tagged = 0
net.bridge.bridge-nf-filter-pppoe-tagged = 0
$
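
The before/after comparison in the diagnostic steps can be automated by saving two snapshots and printing the lines that appear only in the second. A minimal sketch: changed_keys is a hypothetical helper, and sample lines stand in for real sysctl -a output so the example runs without loading any module.

```shell
# changed_keys BEFORE AFTER: print lines present in AFTER but not in BEFORE.
# grep: -F fixed strings, -x whole-line match, -v invert, -f patterns from file
changed_keys() {
    grep -Fxv -f "$1" "$2"
}

before=$(mktemp)
after=$(mktemp)
# On a real host each snapshot would come from: sysctl -a 2>/dev/null | sort
# taken before and after the sosreport run. Sample data is used here.
printf '%s\n' 'kernel.ostype = Linux' > "$before"
printf '%s\n' 'kernel.ostype = Linux' \
    'net.bridge.bridge-nf-call-iptables = 1' > "$after"

changed_keys "$before" "$after"
rm -f "$before" "$after"
```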
RHEL6/7 Crash Dump Analysis
What is crash dump analysis?
To determine the cause of the system crash, you can use the crash utility, which provides an interactive
prompt very similar to the GNU Debugger (GDB). This utility allows you to interactively analyze a
running Linux system as well as a core dump created by netdump, diskdump, xendump, or kdump.
How to install/run a crash utility?
To analyze the vmcore dump file, you must have the crash and kernel-debuginfo packages installed.
To install the crash package in your system, type the following at a shell prompt as root:
# yum install crash
To install the kernel-debuginfo package, make sure that you have the yum-utils package installed and run
the following command as root:
# debuginfo-install kernel
Note that in order to use this command, you need to have access to the repository with debugging
packages. If your system is registered with Red Hat Subscription Management, enable the
rhel-6-variant-debug-rpms repository. If your system is registered with RHN Classic, subscribe the system to the
rhel-architecture-variant-6-debuginfo channel.
To start the utility, type the command in the following form at a shell prompt:
# crash /usr/lib/debug/lib/modules/kernel/vmlinux /var/crash/timestamp/vmcore
Note that the kernel version must be the same as the one captured by kdump.
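
In the command form above, kernel stands for the kernel version string and timestamp for the dump directory name. The sketch below builds the vmlinux path for a given version; vmlinux_for is a hypothetical helper and EXAMPLE-TIMESTAMP is a made-up placeholder. It uses the running kernel only for illustration, since the crashed kernel may be a different version.

```shell
# vmlinux_for KVER: print where the kernel-debuginfo package places
# the vmlinux file for kernel version KVER (hypothetical helper).
vmlinux_for() {
    echo "/usr/lib/debug/lib/modules/$1/vmlinux"
}

# uname -r gives the running kernel; substitute the version that actually
# crashed when it differs. The dump directory name is a placeholder.
kver=$(uname -r)
echo "crash $(vmlinux_for "$kver") /var/crash/EXAMPLE-TIMESTAMP/vmcore"
```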

Displaying the Message Buffer:
To display the kernel message buffer, type the log command at the interactive prompt. The kernel message buffer includes the most essential information about the
system crash and, as such, it is always dumped first into the vmcore-dmesg.txt file. This is useful when an attempt to get the full vmcore file failed, for example
because of lack of space on the target location. By default, vmcore-dmesg.txt is located in the /var/crash/ directory.
crash> log
... several lines omitted ...
EIP: 0060:[<c068124f>] EFLAGS: 00010096 CPU: 2
EIP is at sysrq_handle_crash+0xf/0x20
EAX: 00000063 EBX: 00000063 ECX: c09e1c8c EDX: 00000000
ESI: c0a09ca0 EDI: 00000286 EBP: 00000000 ESP: ef4dbf24
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process bash (pid: 5591, ti=ef4da000 task=f196d560 task.ti=ef4da000)
Stack:
c068146b c0960891 c0968653 00000003 00000000 00000002 efade5c0 c06814d0
<0> fffffffb c068150f b7776000 f2600c40 c0569ec4 ef4dbf9c 00000002 b7776000
<0> efade5c0 00000002 b7776000 c0569e60 c051de50 ef4dbf9c f196d560 ef4dbfb4
Call Trace:
[<c068146b>] ? handle_sysrq+0xfb/0x160
[<c06814d0>] ? write_sysrq_trigger+0x0/0x50
[<c068150f>] ? write_sysrq_trigger+0x3f/0x50
[<c0569ec4>] ? proc_reg_write+0x64/0xa0
[<c0569e60>] ? proc_reg_write+0x0/0xa0
[<c051de50>] ? vfs_write+0xa0/0x190
[<c051e8d1>] ? sys_write+0x41/0x70
[<c0409adc>] ? syscall_call+0x7/0xb
Code: a0 c0 01 0f b6 41 03 19 d2 f7 d2 83 e2 03 83 e0 cf c1 e2 04 09 d0 88 41 03 f3 c3 90 c7 05 c8 1b 9e c0 01 00 00 00 0f ae f8 89 f6 <c6> 05 00 00 00 00 01 c3 89 f6 8d bc 27 00 00 00 00 8d 50 d0 83
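
When the full vmcore is missing, the saved message buffer files are the first thing to look for. A small sketch, assuming dumps live in per-crash subdirectories below /var/crash; list_dmesg is a hypothetical helper name.

```shell
# list_dmesg DIR: print every vmcore-dmesg.txt found below DIR, sorted.
# Errors (for example a missing DIR) are silenced, giving empty output.
list_dmesg() {
    find "$1" -type f -name 'vmcore-dmesg.txt' 2>/dev/null | sort
}

list_dmesg /var/crash
```

Each path printed can be read with any pager, without the crash utility or debuginfo packages.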
Displaying a Backtrace:
To display the kernel stack trace, type the bt command at the interactive prompt. You can use bt pid to display the backtrace of the selected process.
crash> bt
PID: 5591 TASK: f196d560 CPU: 2 COMMAND: "bash"
#0 [ef4dbdcc] crash_kexec at c0494922
#1 [ef4dbe20] oops_end at c080e402
#2 [ef4dbe34] no_context at c043089d
#3 [ef4dbe58] bad_area at c0430b26
#4 [ef4dbe6c] do_page_fault at c080fb9b
#5 [ef4dbee4] error_code (via page_fault) at c080d809
EAX: 00000063 EBX: 00000063 ECX: c09e1c8c EDX: 00000000 EBP: 00000000
DS: 007b ESI: c0a09ca0 ES: 007b EDI: 00000286 GS: 00e0
CS: 0060 EIP: c068124f ERR: ffffffff EFLAGS: 00010096
#6 [ef4dbf18] sysrq_handle_crash at c068124f
#7 [ef4dbf24] handle_sysrq at c0681469
#8 [ef4dbf48] write_sysrq_trigger at c068150a
#9 [ef4dbf54] proc_reg_write at c0569ec2
#10 [ef4dbf74] vfs_write at c051de4e
#11 [ef4dbf94] sys_write at c051e8cc
#12 [ef4dbfb0] system_call at c0409ad5
EAX: ffffffda EBX: 00000001 ECX: b7776000 EDX: 00000002
DS: 007b ESI: 00000002 ES: 007b EDI: b7776000
SS: 007b ESP: bfcb2088 EBP: bfcb20b4 GS: 0033
CS: 0073 EIP: 00edc416 ERR: 00000004 EFLAGS: 00000246

Displaying a Process Status:
To display status of processes in the system, type the ps command at the interactive prompt. You can use ps pid to display the
status of the selected process.
crash> ps
   PID   PPID  CPU  TASK      ST  %MEM    VSZ    RSS  COMM
>    0      0    0  c09dc560  RU   0.0      0      0  [swapper]
>    0      0    1  f7072030  RU   0.0      0      0  [swapper]
     0      0    2  f70a3a90  RU   0.0      0      0  [swapper]
>    0      0    3  f70ac560  RU   0.0      0      0  [swapper]
     1      0    1  f705ba90  IN   0.0   2828   1424  init
... several lines omitted ...
  5566      1    1  f2592560  IN   0.0  12876    784  auditd
  5567      1    2  ef427560  IN   0.0  12876    784  auditd
  5587   5132    0  f196d030  IN   0.0  11064   3184  sshd
> 5591   5587    2  f196d560  RU   0.0   5084   1648  bash

Displaying Virtual Memory Information:
To display basic virtual memory information, type the vm command at the interactive prompt. You can use vm pid to display information on the selected process.
crash> vm
PID: 5591   TASK: f196d560  CPU: 2   COMMAND: "bash"
   MM       PGD      RSS    TOTAL_VM
f19b5900  ef9c6000  1648k    5084k
  VMA       START      END    FLAGS  FILE
f1bb0310    242000    260000 8000875  /lib/ld-2.12.so
f26af0b8    260000    261000 8100871  /lib/ld-2.12.so
efbc275c    261000    262000 8100873  /lib/ld-2.12.so
efbc2a18    268000    3ed000 8000075  /lib/libc-2.12.so
efbc23d8    3ed000    3ee000 8000070  /lib/libc-2.12.so
efbc2888    3ee000    3f0000 8100071  /lib/libc-2.12.so
efbc2cd4    3f0000    3f1000 8100073  /lib/libc-2.12.so
efbc243c    3f1000    3f4000  100073
efbc28ec    3f6000    3f9000 8000075  /lib/libdl-2.12.so
efbc2568    3f9000    3fa000 8100071  /lib/libdl-2.12.so
efbc2f2c    3fa000    3fb000 8100073  /lib/libdl-2.12.so
f26af888    7e6000    7fc000 8000075  /lib/libtinfo.so.5.7
f26aff2c    7fc000    7ff000 8100073  /lib/libtinfo.so.5.7
efbc211c    d83000    d8f000 8000075  /lib/libnss_files-2.12.so
efbc2504    d8f000    d90000 8100071  /lib/libnss_files-2.12.so
efbc2950    d90000    d91000 8100073  /lib/libnss_files-2.12.so
            edc000    edd000 4040075
           8047000   8118000 8001875  /bin/bash
           8118000   811d000 8101873  /bin/bash
f1bb0c70   811d000   8122000  100073
f26afae0   9fd9000   9ffa000  100073
... several lines omitted ...
Displaying Open Files:
To display information about open files, type the files command at the interactive prompt. You can use files pid to display files
opened by the selected process.
crash> files
PID: 5591 TASK: f196d560 CPU: 2 COMMAND: "bash"
ROOT: / CWD: /root
 FD    FILE     DENTRY    INODE    TYPE  PATH
  0  f734f640  eedc2c6c  eecd6048  CHR   /pts/0
  1  efade5c0  eee14090  f00431d4  REG   /proc/sysrq-trigger
  2  f734f640  eedc2c6c  eecd6048  CHR   /pts/0
 10  f734f640  eedc2c6c  eecd6048  CHR   /pts/0
255  f734f640  eedc2c6c  eecd6048  CHR   /pts/0

Exiting the Utility:


To exit the interactive prompt and terminate crash, type exit or q.

Fadump feature with kdump:
Starting with Red Hat Enterprise Linux 6.8, an alternative dumping mechanism to kdump, the firmware-assisted dump (fadump), is available. The fadump feature is supported only on IBM Power Systems. The goal of fadump is to capture the dump of a crashed system from a fully reset system, and to minimize the total elapsed time until the system is back in production use. The fadump feature is integrated with the kdump infrastructure present in user space, so that switching between the kdump and fadump mechanisms is seamless.

How to configure kernel crash dumps on a Linux server (kexec/kdump)?
Kexec (short for kernel execution, by analogy with the Unix/Linux kernel call exec) is a mechanism of the Linux kernel that allows a new kernel to be booted live over the currently running kernel. Kexec is a fast-boot mechanism: it boots a Linux kernel from the context of an already running kernel without going through the BIOS.

Kdump is a kexec-based crash dumping mechanism for Linux. Kdump functionality is split into two main components, user space and kernel space. The kernel-space patches are already part of the mainline kernel tree. The user-space component is a patch on top of the existing kexec tools.

Prerequisites for installing Kexec/Kdump:

According to Red Hat, at least as much disk space as the size of the RAM plus swap memory is needed. A reboot is required in order to boot the kernel with the new argument. The kexec-tools package must be installed.

Installing:
# yum install kexec-tools
Depending on the version, you will need to add the crashkernel option to the kernel command line parameters in order to reserve memory for the kdump kernel. For RHEL 6 i386 and x86_64 systems, use crashkernel=128M.

Configuring:
Add or change the crashkernel option in /boot/grub/grub.conf.
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/mapper/vg_centos-lv_root
#          initrd /initrd-[generic-]version.img
#boot=/dev/sdb
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.32-431.el6.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-431.el6.x86_64 ro root=/dev/mapper/vg_centos-lv_root rd_NO_LUKS LANG=en_US.UTF-8 rd_LVM_LV=vg_centos/lv_swap KEYBOARDTYPE=pc KEYTABLE=es rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=128M rd_LVM_LV=vg_centos/lv_root rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-431.el6.x86_64.img
title CentOS (2.6.32-504.3.3.el6.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-504.3.3.el6.x86_64 ro root=/dev/mapper/vg_centos-lv_root rd_NO_LUKS LANG=en_US.UTF-8 rd_LVM_LV=vg_centos/lv_swap KEYBOARDTYPE=pc KEYTABLE=es rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_centos/lv_root rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-504.3.3.el6.x86_64.img
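After the reboot it is worth confirming that the running kernel was actually started with a crashkernel argument. The following is a minimal sketch; the function name crashkernel_arg is hypothetical, and the sketch assumes the command line is passed in as a single string, for example the contents of /proc/cmdline.

```shell
# Print the value of the crashkernel= argument found in a kernel command line.
# $1: the command line text, e.g. "$(cat /proc/cmdline)" on a live system.
# Returns 1 when no crashkernel argument is present (no memory reserved).
crashkernel_arg() {
    # Unquoted on purpose: split the command line into words.
    for arg in $1; do
        case "$arg" in
            crashkernel=*)
                echo "${arg#crashkernel=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Example on a live system:
#   crashkernel_arg "$(cat /proc/cmdline)" || echo "crashkernel not set"
```

With the two menu entries above, the first kernel would report 128M and the second auto.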

Specifying Kdump Location:
In the file /etc/kdump.conf you can specify the kdump vmcore location. You can dump directly to a device, to a file, or to a location on the network via NFS or SSH. In a virtualized environment the easiest approach is to add a new disk.

At the end of the file, specify the file system type and the device name, file system label, or UUID (as in /etc/fstab). You may also specify a directory where the dump should be written; if you do not, the default path is /var/crash.

# vi /etc/kdump.conf
## Dumping to a file on disk
ext3 /dev/sda1

# or
ext4 UUID=c0312488-e91b-4b00-919f-cfbf4c92274e
path /cores

## Dumping to a network device using NFS
net <nfs server>:</nfs/mount>

## Dumping to a network device using SSH
net <user>@<ssh server>
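To see which directory a given configuration will dump into, the path directive can be read back from the file. The following is a minimal sketch; the function name kdump_path is hypothetical, and the fallback to /var/crash follows the default described above.

```shell
# Print the dump directory configured in a kdump.conf file.
# $1: path to the configuration file (defaults to /etc/kdump.conf).
# Falls back to /var/crash when no uncommented "path" directive is found.
kdump_path() {
    conf=${1:-/etc/kdump.conf}
    # Keep the last uncommented "path" directive, if there is one.
    dir=$(awk '$1 == "path" { p = $2 } END { if (p != "") print p }' "$conf" 2>/dev/null)
    echo "${dir:-/var/crash}"
}

# Example:
#   kdump_path /etc/kdump.conf
```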

Boot the kernel with the new argument:
# shutdown -r now

Testing:
Warning: This will panic your kernel, killing all services on the machine.
# echo 1 > /proc/sys/kernel/sysrq
# echo "c" > /proc/sysrq-trigger

After the kernel panic, the system reboots, and the vmcore is saved to the location configured in /etc/kdump.conf (/var/crash by default).
Userspace debugging

Crashing application
- Through an (unhandled) signal such as SIGABRT, SIGFPE, SIGSEGV, SIGBUS...
- Typically produces a line in the kernel log (dmesg)
Example,
modprobe[833]: segfault at 7fff76200038 ip 00007f0de8422fc2 sp 00007fff761b6cb0 error 4 in ld-2.11.1.so[7f0de8420000+20000]
- Will produce a core(5) file if
  - Limits allow it (ulimit -c; /etc/security/limits.conf)
  - Binary is readable and not suid...
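The segfault line in the kernel log carries enough information to locate the faulting instruction inside the mapped object: subtracting the mapping base (the value in square brackets before the +) from the instruction pointer (ip) gives an offset into the mapping. The following is a minimal sketch; the function name segfault_offset is hypothetical, and it assumes the log entry has been joined onto a single line.

```shell
# Compute the offset of the faulting instruction inside the mapped object
# from one "segfault at ..." kernel log line.
# $1: the log entry as a single line.
segfault_offset() {
    # Instruction pointer: the hex value following " ip ".
    ip=$(printf '%s\n' "$1" | sed -n 's/.* ip \([0-9a-f]*\) .*/\1/p')
    # Mapping base: the hex value inside "[base+size]".
    base=$(printf '%s\n' "$1" | sed -n 's/.*\[\([0-9a-f]*\)+[0-9a-f]*\].*/\1/p')
    [ -n "$ip" ] && [ -n "$base" ] || return 1
    printf '0x%x\n' $(( 0x$ip - 0x$base ))
}

# Example with the log line shown above:
#   segfault_offset 'modprobe[833]: segfault at 7fff76200038 ip 00007f0de8422fc2 sp 00007fff761b6cb0 error 4 in ld-2.11.1.so[7f0de8420000+20000]'
```

For the example line the result is 0x2fc2 (0x7f0de8422fc2 minus 0x7f0de8420000). Whether that offset maps directly to an address in the file depends on how the object was mapped, so treat it as a starting point for tools such as objdump rather than a final answer.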

Application stuck in syscall
- cat /proc/$PID/stack

Executing a program under gdb
- Relies on the ptrace(2) syscall
- gdb /path/to/binary
- (gdb) run $param1 $param2...

Attaching to a running program
- gdb -p $PID
- We can also create a core file (without crashing)
- (gdb) generate-core-file

Inspecting the core file
- gdb /path/to/binary /path/to/core

strace: tool for tracing system calls and signals
- Prints system call parameters and return values with symbolic translation.
Example,
open("/foo/bar", O_RDONLY) = -1 ENOENT (No such file or directory)
- Tries to keep ordering of enter/return between threads
- Dereferences structure members
- Can attach to a PID
In some cases, strace output has proven to be more readable than the source.

valgrind: tool for finding memory access bugs
Understanding kernel oops/panic output
Printed on the console, typically on fatal CPU exceptions
- Lots of architecture-specific information
- May contain enough information to figure out the problem without a full crash dump
Oops leaves the system running
- Kills just the current process (including kernel threads!)
- System can still be left inconsistent (locks remain locked ...)
Panic kills the system completely
- Triggered by an oops in interrupt context, an oops with panic_on_oops enabled, or manual panic() calls, for example on HW failure, critical memory allocation failure, init/idle task killed, interrupt handler killed.
- May trigger a crash dump if configured, or reboot after a delay.
Setting up the machine to capture an Oops:
The running kernel should be compiled with CONFIG_DEBUG_INFO, and syslogd should be running.
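Whether an oops is escalated to a panic, and whether a panic is followed by a reboot, is controlled by the kernel.panic_on_oops and kernel.panic sysctls. The following is a minimal sketch that reports the current policy; the function name panic_policy is hypothetical, and the directory argument exists only so the function can be pointed at a copy of the files.

```shell
# Report how the kernel is set to react to an oops and to a panic.
# $1: directory holding the sysctl files (defaults to /proc/sys/kernel).
panic_policy() {
    dir=${1:-/proc/sys/kernel}
    on_oops=$(cat "$dir/panic_on_oops" 2>/dev/null) || return 1
    delay=$(cat "$dir/panic" 2>/dev/null) || return 1

    if [ "$on_oops" -ne 0 ]; then
        echo "oops: escalates to panic"
    else
        echo "oops: kills the current task, system keeps running"
    fi

    # kernel.panic: positive = reboot after N seconds, 0 = stay halted,
    # negative = reboot immediately.
    if [ "$delay" -gt 0 ]; then
        echo "panic: reboot after ${delay}s"
    elif [ "$delay" -lt 0 ]; then
        echo "panic: reboot immediately"
    else
        echo "panic: no automatic reboot"
    fi
}

# Example on a live system:
#   panic_policy
```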

Soft lockup
- CPU spent 20s in the kernel without reaching a schedule point.
- A warning only, unless the config option/boot parameter softlockup_panic is enabled. A soft lockup can be harmless, so enabling the panic is not a good idea in production.
Hard lockup
- CPU spent 10s with interrupts disabled.
Detection of both combines several generic mechanisms:
- A high-priority kernel watchdog thread updates the soft lockup timestamp.
- An hrtimer set to deliver periodic interrupts increments the hard lockup counter and wakes up the watchdog thread.
- An NMI perf event checks whether the hrtimer interrupts were processed and whether the watchdog thread was scheduled.
Hung task check
- INFO: task ... blocked for more than 120 seconds.
- khungtaskd periodically processes tasks in uninterruptible sleep and checks whether their switch count changed.
RCU stall detector
- Detects when an RCU grace period is too long (21s).
- CPU looping in an RCU critical section or with disabled interrupts, preemption or bottom halves; no scheduling points in non-preempt kernels.
- RT task preempting a non-RT task in an RCU critical section.
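The 10s and 20s figures for hard and soft lockups are not independent: on kernels that expose the kernel.watchdog_thresh sysctl, the hard lockup threshold is that value and the soft lockup threshold is twice that value. The following is a minimal sketch that prints the thresholds in effect; the function name lockup_thresholds is hypothetical, and it assumes the watchdog_thresh and hung_task_timeout_secs files are present, which depends on the kernel version and configuration.

```shell
# Print the lockup and hung-task detection thresholds.
# $1: directory holding the sysctl files (defaults to /proc/sys/kernel).
lockup_thresholds() {
    dir=${1:-/proc/sys/kernel}
    thresh=$(cat "$dir/watchdog_thresh" 2>/dev/null) || return 1
    echo "hard lockup: ${thresh}s"
    # The soft lockup threshold is twice the watchdog threshold.
    echo "soft lockup: $(( 2 * thresh ))s"
    if [ -r "$dir/hung_task_timeout_secs" ]; then
        echo "hung task: $(cat "$dir/hung_task_timeout_secs")s"
    fi
}

# Example on a live system:
#   lockup_thresholds
```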

Shortcut keys (SysRq hot keys) for dealing with hangs and security issues:
- Operator's intervention to the running system
- Can be enabled/disabled by /proc/sys/kernel/sysrq
Alt + SysRq + 0 .. 9   set console logging level
Alt + SysRq + H        show help
Alt + SysRq + C        crash by a NULL pointer dereference
Alt + SysRq + B        immediate reboot
Alt + SysRq + O        immediate shutdown
Alt + SysRq + S        sync all mounted filesystems
Alt + SysRq + U        remount all filesystems read-only
Alt + SysRq + J        freeze filesystems by FIFREEZE ioctl
Alt + SysRq + P        dump registers to console
Alt + SysRq + T        dump process information to console
Alt + SysRq + L        dump stack traces of running threads
Alt + SysRq + M        dump memory statistics to console
Alt + SysRq + D        dump locked locks to console
Alt + SysRq + K        kill all processes on the current console
Alt + SysRq + E        terminate all processes except init
Alt + SysRq + I        kill all processes except init
Alt + SysRq + F        execute the OOM killer
Alt + SysRq + N        reset nice level of all real-time processes
Alt + SysRq + R        switch off raw keyboard mode
Alt + SysRq + Q        dump armed hrtimers, clockevent devices
Alt + SysRq + V        forcefully restore framebuffer console
Alt + SysRq + W        dump tasks in uninterruptible sleep
Alt + SysRq + Z        dump the ftrace buffer
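The value in /proc/sys/kernel/sysrq is not just on or off: 0 disables SysRq, 1 enables every function, and larger values act as a bitmask selecting groups of functions. The following is a minimal sketch that decodes such a value; the function name decode_sysrq is hypothetical, and the bit assignments are the ones described in the kernel's SysRq documentation, so they should be checked against the kernel version in use.

```shell
# Decode a kernel.sysrq value into the groups of functions it allows.
# $1: the numeric value, e.g. "$(cat /proc/sys/kernel/sysrq)".
decode_sysrq() {
    mask=$1
    case "$mask" in
        0) echo "all SysRq functions disabled"; return 0 ;;
        1) echo "all SysRq functions enabled"; return 0 ;;
    esac
    for entry in \
        "2:console logging level control" \
        "4:keyboard control (SAK, unraw)" \
        "8:debugging dumps of processes" \
        "16:sync" \
        "32:remount read-only" \
        "64:signalling of processes (term, kill, oom-kill)" \
        "128:reboot/poweroff" \
        "256:nicing of all real-time tasks"
    do
        bit=${entry%%:*}
        label=${entry#*:}
        if [ $(( mask & bit )) -ne 0 ]; then
            echo "$bit: $label"
        fi
    done
}

# Example on a live system:
#   decode_sysrq "$(cat /proc/sys/kernel/sysrq)"
```

A value of 176, for instance, decodes to sync, remount read-only, and reboot/poweroff (16 + 32 + 128), which means the crash trigger on Alt + SysRq + C would not be accepted from the keyboard with that setting.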
How to analyse a vmcore file and the various errors identified in it?
Example: a core dump file (vmcore) after conversion from binary to readable form. Use the binutils package to read binary core dump files.

Introduction to SLES system data and crash dump analysis
Overview on the tools used in SLES supportconfig data collection including their installation and execution.
Installation Instructions
SLE12 and Higher
Download the supportutils-3.0-XXX package.
Log in as root.
Install the supportutils-3.0-*.noarch.rpm:
# rpm -Uvh supportutils-3.0-*.noarch.rpm

SLE10 and SLE11
Download the supportutils-1.20-XXX package.
Log in as root.
Install the supportutils-1.20-*.noarch.rpm:
# rpm -Uvh supportutils-1.20-*.noarch.rpm

SLE9
Download the supportutils-1.20le-XXX package.
Log in as root.
Install the supportutils-1.20le-*.noarch.rpm:
# rpm -Uvh supportutils-1.20le-*.noarch.rpm

Collecting System Information with Supportconfig
To upload a supportconfig to SUSE, run supportconfig -ur $srnum; where $srnum is your service request number. You can also
just run supportconfig for local use. By default, supportconfig saves its information in /var/log/nts_hostname_date_time.tbz.
Creating a Supportconfig Archive with YaST
To use YaST to gather your system information, proceed as follows:
Start YaST and open the Support module.

Click Create report tarball.
In the next window, select one of the supportconfig options from the radio button list. Use Custom (Expert) Settings is pre-selected by default. If you want to test the report function first, use Only gather a minimum amount of info. For some background information on the other options, refer to the supportconfig man page. Proceed with Next.
Enter your contact information. It will be written to a file called basic-environment.txt and included in the archive to be created.
If you want to submit the archive to Global Technical Support at the end of the information collection process, Upload Information is required. YaST automatically proposes an upload server. If you want to modify it, refer to Section 2.1.2, Upload Targets for details of which upload servers are available. If you want to submit the archive later on, you can leave the Upload Information empty for now.
Proceed with Next. The information gathering begins.

After the process is finished, continue with Next.
Review the data collection: Select the File Name of a log file to view its contents in YaST. To remove any files you want excluded from the TAR archive before submitting it to support, use Remove from Data. Continue with Next.
Save the TAR archive. If you started the YaST module as root user, by default YaST proposes to save the archive to /var/log (otherwise, to your home directory). The file name format is nts_HOST_DATE_TIME.tbz.
Creating a Supportconfig Archive from Command Line:
The following procedure shows how to create a supportconfig archive, but without submitting it to support directly. For uploading it, you need to run the command with certain options as described in Submitting Information to Support from Command Line.
Open a shell and become root.
Run supportconfig without any options. This gathers the default system information.
Wait for the tool to complete the operation.
The default archive location is /var/log, with the file name format being nts_HOST_DATE_TIME.tbz.
Common Supportconfig Options:
The supportconfig utility is usually called without any options. Display a list of all options with supportconfig -h or refer to the man page. The following list gives a brief overview of some common use cases:
Reducing the Size of the Information Being Gathered
Use the minimal option (-m):
# supportconfig -m
Limiting the Information to a Specific Topic
If you have already localized a problem with the default supportconfig output and have found that it relates to a specific area or feature set only, you may want to limit the collected information to the specific area for the next supportconfig run. For example, if you detected problems with LVM and want to test a recent change that you did to the LVM configuration, it makes sense to gather the minimum supportconfig information around LVM only:
# supportconfig -i LVM
For a complete list of feature keywords that you can use for limiting the collected information to a specific area, run
# supportconfig -F
Including Additional Contact Information in the Output:
# supportconfig -E tux@example.org -N "GSS IER Unix L2" -O "Verizon Enterprise Solutions" ...
Collecting Already Rotated Log Files
# supportconfig -l
This is especially useful in high logging environments or after a kernel crash when syslog rotates the log files after a reboot.

Overview on the crash dump analysis on SLES including their installation and execution:
Crash utility analyzes crash dumps and debugs the running system as well. It provides functionality specific to debugging the Linux kernel and is much
more suitable for advanced debugging. If you want to debug the Linux kernel, you need to install its debugging information package in addition. Check if
the package is installed on your system with zypper se kernel | grep debug. To open the captured dump in crash on the machine that produced the
dump, use a command like this:
# crash /boot/vmlinux-2.6.32.8-0.1-default.gz /var/crash/2010-04-23-11\:17/vmcore
The first parameter represents the kernel image. The second parameter is the dump file captured by kdump. You can find this file under /var/crash by
default.
Kernel Binary Formats:-
The Linux kernel comes in Executable and Linkable Format (ELF). This file is usually called vmlinux and is directly generated in the compilation process.
Not all boot loaders, especially on x86 (i386 and x86_64) architecture, support ELF binaries. The following solutions exist on different architectures
supported by SUSE Linux Enterprise Desktop.
x86 (i386 and x86_64)
Mostly for historic reasons, the Linux kernel consists of two parts: the Linux kernel itself (vmlinux) and the setup code run by the boot loader.
These two parts are linked together in a file called bzImage, which can be found in the kernel source tree. The file is now called vmlinuz (note z vs.
x) in the kernel package.
The ELF image is never directly used on x86. Therefore, the main kernel package contains the vmlinux file in compressed form called vmlinux.gz.
To sum it up, an x86 SUSE kernel package has two kernel files:
vmlinuz which is executed by the boot loader.
vmlinux.gz, the compressed ELF image that is required by crash and GDB.
IA64
The elilo boot loader, which boots the Linux kernel on the IA64 architecture, supports loading ELF images (even compressed ones) out of the box. The
IA64 kernel package contains only one file called vmlinuz. It is a compressed ELF image. vmlinuz on IA64 is the same as vmlinux.gz on x86.

PPC and PPC64
The yaboot boot loader on PPC also supports loading ELF images, but not compressed ones. In the PPC kernel package, there is an ELF Linux
kernel file vmlinux. Considering crash, this is the easiest architecture.
If you decide to analyze the dump on another machine, you must check both the architecture of the computer and the files necessary for
debugging.
You can analyze the dump on another computer only if it runs a Linux system of the same architecture. To check the compatibility, use the command
uname -i on both computers and compare the outputs.
If you are going to analyze the dump on another computer, you also need the appropriate files from the kernel and kernel debug packages.
Put the kernel dump, the kernel image from /boot, and its associated debugging info file from /usr/lib/debug/boot into a single empty directory.
Additionally, copy the kernel modules from /lib/modules/$(uname -r)/kernel/ and the associated debug info files from
/usr/lib/debug/lib/modules/$(uname -r)/kernel/ into a subdirectory named modules.
In the directory with the dump, the kernel image, its debug info file, and the modules subdirectory, launch the crash utility: crash vmlinux-version
vmcore.
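The staging steps for analysis on another machine can be sketched as a small helper that gathers the dump, the kernel image, its debug info file, and the module trees into one working directory. This is an illustration only: the function name stage_dump and its argument order are hypothetical, and the actual file names depend on the installed kernel version.

```shell
# Copy a dump and the files needed to analyze it into one working directory.
# usage: stage_dump <dest-dir> <vmcore> <kernel-image> <debuginfo-file> [modules-dir ...]
# Every modules-dir given is copied into <dest-dir>/modules.
stage_dump() {
    if [ $# -lt 4 ]; then
        echo "usage: stage_dump dest vmcore kernel debuginfo [modules-dir ...]" >&2
        return 2
    fi
    dest=$1; vmcore=$2; kernel=$3; debuginfo=$4
    shift 4

    # Refuse to start if one of the required files is missing.
    for f in "$vmcore" "$kernel" "$debuginfo"; do
        [ -r "$f" ] || { echo "cannot read: $f" >&2; return 1; }
    done

    mkdir -p "$dest/modules" || return 1
    cp "$vmcore" "$kernel" "$debuginfo" "$dest/" || return 1
    for m in "$@"; do
        cp -R "$m/." "$dest/modules/" || return 1
    done
}

# Example, run on the machine that produced the dump (version strings elided):
#   stage_dump /tmp/analysis /var/crash/.../vmcore /boot/vmlinux-....gz \
#       /usr/lib/debug/boot/vmlinux-....debug \
#       "/lib/modules/$(uname -r)/kernel" \
#       "/usr/lib/debug/lib/modules/$(uname -r)/kernel"
```

The resulting directory can then be transferred to the analysis machine, where crash is started from inside it.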
Regardless of the computer on which you analyze the dump, the crash utility will produce an output similar to this:
# crash /boot/vmlinux-2.6.32.8-0.1-default.gz /var/crash/2010-04-23-11\:17/vmcore
The command output prints first useful data: There were 42 tasks running at the moment of the kernel crash. The cause of the crash was a SysRq trigger
invoked by the task with PID 9446. It was a Bash process because the echo that has been used is an internal command of the Bash shell.
The crash utility builds upon GDB and provides many useful additional commands. If you enter bt without any parameters, the backtrace of the task
running at the moment of the crash is printed.

Now it is clear what happened: The internal echo command of Bash shell sent a character to /proc/sysrq-trigger. After the corresponding handler
recognized this character, it invoked the crash_kexec() function. This function called panic() and kdump saved a dump.
In addition to the basic GDB commands and the extended version of bt, the crash utility defines many other commands related to the structure of the
Linux kernel. These commands understand the internal data structures of the Linux kernel and present their contents in a human readable format.
For example, you can list the tasks running at the moment of the crash with ps. With sym, you can list all the kernel symbols with the corresponding
addresses, or inquire an individual symbol for its value. With files, you can display all the open file descriptors of a process. With kmem, you can
display details about the kernel memory usage. With vm, you can inspect the virtual memory of a process, even at the level of individual page
mappings. The list of useful commands is very long and many of these accept a wide range of options.
The commands that we mentioned reflect the functionality of the common Linux commands, such as ps and lsof. If you would like to find out the
exact sequence of events with the debugger, you need to know how to use GDB and to have strong debugging skills. Both of these are out of the
scope of this document. In addition, you need to understand the Linux kernel. Several useful reference information sources are given at the end of
this document.
Advanced kdump Configuration
The configuration for kdump is stored in /etc/sysconfig/kdump. You can also use YaST to configure it. kdump configuration options are available
under System > Kernel Kdump in YaST Control Center. The following kdump options may be useful for you:
You can change the directory for the kernel dumps with the KDUMP_SAVEDIR option. Keep in mind that the size of kernel dumps can be very large.
kdump will refuse to save the dump if the free disk space, subtracted by the estimated dump size, drops below the value specified by the
KDUMP_FREE_DISK_SIZE option. Note that KDUMP_SAVEDIR understands URL format protocol://specification, where protocol is one of file, ftp,
sftp, nfs or cifs, and specification varies for each protocol. For example, to save kernel dump on an FTP server, use the following URL as a template:
ftp://username:password@ftp.example.com:123/var/crash.
Kernel dumps are usually huge and contain many pages that are not necessary for analysis. With the KDUMP_DUMPLEVEL option, you can omit such
pages. The option takes a numeric value between 0 and 31. A value of 0 omits nothing and produces the largest dump; a value of 31 produces the
smallest dump.
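The range 0 to 31 is a bit mask. As a sketch, the helper below decodes a dump level into the page types that are left out, assuming the bit meanings documented for the -d option of makedumpfile (1 zero pages, 2 cache pages, 4 private cache pages, 8 user pages, 16 free pages). The function name describe_dumplevel and the short labels are made up for this illustration.

```shell
# describe_dumplevel LEVEL   (hypothetical helper)
# Prints which page types a KDUMP_DUMPLEVEL value excludes, assuming the
# bit meanings of "makedumpfile -d".
describe_dumplevel() {
    level=$1
    out=""
    [ $((level & 1)) -ne 0 ]  && out="$out zero"
    [ $((level & 2)) -ne 0 ]  && out="$out cache"
    [ $((level & 4)) -ne 0 ]  && out="$out cache-private"
    [ $((level & 8)) -ne 0 ]  && out="$out user"
    [ $((level & 16)) -ne 0 ] && out="$out free"
    echo "excluded:${out:- none}"
}

describe_dumplevel 0    # excluded: none
describe_dumplevel 31   # excluded: zero cache cache-private user free
```

Intermediate values trade dump size against the amount of data kept; for example, 17 would drop only zero pages and free pages.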
It is often useful to make the kernel dump smaller, for example if you want to transfer the dump over the network or
need to save disk space in the dump directory. This can be done by setting KDUMP_DUMPFORMAT to compressed. The crash utility supports
dynamic decompression of compressed dumps.
You always need to execute rckdump restart after you make manual changes to /etc/sysconfig/kdump. Otherwise these changes take effect only the next
time you reboot the system.
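Taken together, the options discussed above might look like the following fragment of /etc/sysconfig/kdump. The values are illustrative assumptions only (a local dump directory, a 64 MB reserve, the smallest dump level, compressed format) and should be adapted to the system at hand.

```shell
# /etc/sysconfig/kdump (fragment, example values only)
KDUMP_SAVEDIR="file:///var/crash"    # where dumps are written
KDUMP_FREE_DISK_SIZE="64"            # space that must remain free after saving
KDUMP_DUMPLEVEL="31"                 # 0 = largest dump, 31 = smallest dump
KDUMP_DUMPFORMAT="compressed"        # crash can read compressed dumps directly

# After editing the file by hand, apply the change with:
#   rckdump restart
```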

Introduction to SLES system data and
crash dump analysis
Step by Step Crash dump generation example in SLES11

The first screen shows useful information such as the process name, PID, status and CPU.
We can now analyze the core dump with commands such as backtrace, files, ps and log, and take the analysis as deep
as desired.

Overview on the tools used in Oracle Solaris explorer
data collection including their installation and execution

Oracle Explorer is a tool similar to sosreport in RHEL and supportconfig in SLES. It is used to collect system data and is helpful in analysing detailed
system and log information after a system crash or fault.
Oracle Explorer is distributed in the Services Tools Bundle (STB) and is made available via the STB download link.
Use the following procedure to download the latest Services Tools Bundle:
Go to the Oracle Explorer Document Collection web page and read the Oracle Explorer Third Party License Agreement, which explains the terms and
conditions under which the third-party software that is included in Oracle Explorer is available for use.
Go to the STB site at http://www.sun.com/service/stb/index.jsp and click the Software Download and Documentation link in the Resources
section. In the drop-down lists, select the appropriate Platform and Language for your download.
Review the STB License Agreement and mark the I agree check box to proceed with downloading.
The Sun Download Center might require you to log in before proceeding.
Click install_stb.sh to download the installer.
A few useful questions and answers from the Oracle documentation site:-


How to Install/run Oracle Explorer Manually


Oracle Explorer must be installed in the global zone if you are installing it on the Solaris 10 Operating System (Solaris OS). In Solaris 10, the pkgadd
command includes a -G flag that restricts installation to the global zone.
Explorer is a data collection utility that collects system-wide information (including the system configuration and logs) for
analysis. Explorer is not installed by default. It is part of the Services Tools Bundle and can be downloaded and installed
from http://www.oracle.com/us/support/systems/premier/services-tools-bundle-sun-systems-163717.html
Services Tools Bundle (STB) version 6.0 is a single self-extracting installer bundle supporting all Sun standard operating systems and
architectures. STB is available in 3 formats for download, and you can choose any one of them. This example uses install_stb.sh.tar.gz.
Once it is downloaded, copy it to a location of your choice and extract it as follows:
# gzcat install_stb.sh.tar.gz | tar xvf -
This should extract a file named install_stb.sh. Ensure that the file is marked as executable:
# chmod +x install_stb.sh
Then execute the script:
# ./install_stb.sh
This should install the STB along with Explorer. Follow the on-screen instructions to complete the
install. Once the installation completes, you can verify it by running:
# cd /opt/SUNWexplo/bin
# ./explorer -w default
<date/time> Solaris-Main[3986] exp_check: ERROR Explorer version (EXP_DEF_VERSION) is not present in explorer defaults file.
<date/time> Solaris-Main[3986] exp_check: ERROR Please run <explorer_install_directory>/bin/explorer -g to update explorer defaults file.
<date/time> Solaris-Main[3986] exp_check: FATAL exited: Explorer EXP_DEF_VERSION is not present.
Explorer expects the defaults file to be present. For the first run, create it with:
# cd /opt/SUNWexplo/bin
# ./explorer -g
This should return some standard dialogs about privacy and then ask a series of questions, as below.
As this is a test case on a VM, we have opted NO for many questions.

Absolute path name for Explorer defaults file?
[/etc/opt/SUNWexplo/default/explorer]:
Company name []: Verizon Enterprise Solutions
Contract ID []: 919-378-4666
System (Solaris-Main, 222adf77) serial number [unknown]: <Serial Number>
Contact name []: GSS.IER.Unix.L2
Contact email address []: anbarasan.x.lienus@intl.verizon.com
Phone number []: 919-378-4666
Address (line 1) []: OTP, Guindy, Chennai
Address (line 2) []:
City []: Chennai
State []: Tamil Nadu
Zip []: 600032
Please select your geo from this list -
1) AMERICAS
2) EMEA
3) APAC
[APAC] 3
APAC
Please enter the two character Country code or

Automatic Submission
At the completion of explorer, all output may be sent to Sun or alternate destinations.
Target: https://supportfiles.sun-DOT-com/curl
Send explorer output via HTTPS if -P is specified? Choose n to specify
an alternate target, such as your SFT (Sun Secure File Transport) listener
[y,n] n
When -P is specified, would you like Explorer output to be sent to an
alternate target destination, such as your SFT (Sun Secure File Transport) listener?
If yes, enter the http[s]://server:port
If not, enter only a single - for your reply.
[]: n
Invalid target destination format
When -P is specified, would you like Explorer output to be sent to an
alternate target destination, such as your SFT (Sun Secure File Transport) listener?
If yes, enter the http[s]://server:port
If not, enter only a single - for your reply.
[n]: n
Invalid target destination format
When -P is specified, would you like Explorer output to be sent to an
alternate target destination, such as your

Invalid target destination format
When -P is specified, would you like Explorer output to be sent to an
alternate target destination, such as your SFT (Sun Secure File Transport) listener?
If yes, enter the http[s]://server:port
If not, enter only a single - for your reply.
[n]: -
Target: explorer-database-apac-AT-sun.com
Send explorer output via Email if -e is specified?
[y,n] n
If -e is specified, would you like all explorer output to be sent to alternate email addresses?
If yes, enter the email addresses here (separate multiple entries with a comma ,).
If not, enter only a single - for your reply.
[]: -
Return address for explorer email output
[abc-AT-abc-DOT-com]:
Email notification when explorer data is uploaded.
If you would like to be notified by email when your explorer output
is uploaded into the repository, enter the email address here.
If notification is not needed, enter a single -.
[]: -
You have answered:
Company name: Verizon Enterprise Solutions
Contract ID: 919-378-4666
System serial number: <Serial Number>
Phone number: 919-378-4666
Address (line 1): OTP, Guindy, Chennai
Address (line 2):
City: Chennai
State: Tamil Nadu
Zip: 600032
Country: India
Country Code: IN
Geography: APAC
Post output to:
HTTPS proxy server:
Mail output to:
Mail output from: nair.padmaraj-AT-gmail-DOT-com
Mail on data load:
Are these values okay?
[y,n] y
Do you wish to schedule explorer in cron? [y,n] n
<date/time> Solaris-Main[4044] explorer: Explorer defaults file updated.
<date/time> Solaris-Main[4044] explorer: Please run
# ./explorer -w default
The output file is stored as /opt/SUNWexplo/output/explorer.*.tar.gz (see tip d for an example of the file name).
Tips:-
a) If you have an earlier version of Explorer installed, it is better to uninstall the old Explorer first.
b) You can exclude modules from running by prefixing them with ! in the -w option.
For example
explorer -v -w default,!nbu
This will not run the nbu modules.
c) If at any time you feel that Explorer has hung or is not responding, you can run the ptree command to see which module is
currently running. This will help you identify where Explorer is hung.
d) The output can be found under /opt/SUNWexplo/output:
# cd /opt/SUNWexplo/output
# ls
explorer.222adf77.Solaris-Main-2016.12.24.07.28
explorer.222adf77.Solaris-Main-2016.12.24.07.28.tar.gz
Upload the .tar.gz file for Sun analysis.
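When several modules need to be skipped, the -w argument from tip b can be assembled by a small helper. This is only a sketch: the function name explorer_w_arg is made up here, and it relies on nothing beyond the default,!module syntax shown in tip b. The ! is kept in a variable because some interactive shells treat a bare ! as a history reference.

```shell
# explorer_w_arg GROUP [MODULE_TO_SKIP ...]   (hypothetical helper)
# Builds the value for "explorer -w": the module group followed by one
# ",!module" entry for every module that should not run.
explorer_w_arg() {
    bang='!'          # kept in a variable to avoid history expansion
    arg=$1
    shift
    for module in "$@"; do
        arg="$arg,$bang$module"
    done
    printf '%s\n' "$arg"
}

explorer_w_arg default nbu    # prints: default,!nbu
```

The result could then be passed on the command line, for example ./explorer -v -w "$(explorer_w_arg default nbu)".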

Overview on the crash dump analysis on Oracle Solaris
including their installation and execution.

Installing and running the crash analysis tool:

Download and install SUNWscat, the Oracle Solaris Crash Analysis Tool (5.5 release):
root@solaris10 # ls p21099218_55000_Generic.zip
root@solaris10 # unzip p21099218_55000_Generic.zip
Archive: p21099218_55000_Generic.zip
inflating: Readme.txt
inflating: SUNWscat5.5-GA-combined.pkg.gz
root@solaris10 #
root@solaris10 # gunzip SUNWscat5.5-GA-combined.pkg.gz
root@solaris10 #
root@solaris10 # ls
Readme.txt SUNWscat5.5-GA-combined.pkg p21099218_55000_Generic.zip
root@solaris10 # pkgadd -d SUNWscat5.5-GA-combined.pkg

The following packages are available:
  1  SUNWscat     Oracle Solaris Crash Analysis Tool (5.5 GA)
                  (any) 5.5

Select package(s) you wish to process (or all to process
all packages). (default: all) [?,??,q]: all

Processing package instance <SUNWscat> from </export/home/SCAT/SUNWscat5.5-GA-combined.pkg>
.
Installation of <SUNWscat> was successful.

Decompress the crash dump file:

root@solaris10 # savecore -vf vmdump.0
savecore: System dump time: Mon Aug 10 05:43:52 2015
savecore: saving system crash dump in /opt/crash/solaris10/{unix,vmcore}.0
Constructing namelist /opt/crash/solaris10/unix.0
Constructing corefile /opt/crash/solaris10/vmcore.0
1:15 100% done: 341483 of 341483 pages saved
48444 (14%) zero pages were not written
1:16 dump decompress is done
root@solaris10 #
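As the transcript shows, expanding vmdump.0 produces unix.0 and vmcore.0 with the same numeric suffix. The sketch below wraps that step so a dump is not expanded twice. The function names dump_suffix and expand_vmdump are hypothetical, and passing the target directory as the last argument to savecore is an assumption based on its usual synopsis.

```shell
# dump_suffix FILE   (hypothetical helper)
# Prints the numeric suffix of a dump file name, e.g. vmdump.0 -> 0.
dump_suffix() {
    printf '%s\n' "${1##*.}"
}

# expand_vmdump VMDUMP_FILE TARGET_DIR   (hypothetical helper)
# Runs savecore on a compressed dump unless the matching vmcore.N is
# already present in TARGET_DIR.
expand_vmdump() {
    dump=$1
    dir=$2
    n=$(dump_suffix "$dump")
    if [ -f "$dir/vmcore.$n" ]; then
        echo "vmcore.$n already present in $dir"
        return 0
    fi
    savecore -vf "$dump" "$dir"
}

# Example call (not executed here):
#   expand_vmdump /opt/crash/solaris10/vmdump.0 /opt/crash/solaris10
```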

Start analyzing the dump:

root@solaris10 # cd /opt/crash/solaris10/
root@solaris10 # /opt/SUNWscat/bin/scat vmcore.0

Oracle Solaris Crash Analysis Tool
Version 5.5 for Oracle Solaris 10 64-bit UltraSPARC

Copyright (c) 1989, 2015, Oracle and/or its affiliates. All rights reserved.

Please note: Do not submit any health, payment card or other sensitive
production data that requires protections greater than those specified in
the Oracle GCS Security Practices. Information on how to remove data from
your submission is available at:
https://support.oracle.com/oip/faces/secure/km/DocumentDisplay.jspx?id=1227943.1

For support, please use the Oracle Solaris kernel community at
https://community.oracle.com/community/support/oracle_sun_technologies/
Select Subspaces and then Oracle Solaris Performance, Panics,
Hangs, and Dtrace.
Further information may be found at https://blogs.oracle.com/SolarisCAT/

opening unix.0 vmcore.0 ...dumphdr...symtab...maps...done
loading crashdump data: modules...CTF...globals...done

crash file: /opt/crash/solaris10/vmcore.0
user: Super-User (root:0)
release: 5.10 (64-bit)
version: Generic_144488-06
machine: sun4v
node name: XXXX
domain: default.solaris10.com
hw_provider: Sun_Microsystems
system type: SUNW,Netra-T5440 (UltraSPARC-T2+)

...disks...done

Run analyze:

CAT(vmcore.0/10V)> analyze

crash file: /opt/crash/solaris10/vmcore.0
user: Super-User (root:0)
release: 5.10 (64-bit)
version: Generic_144488-06
machine: sun4v
node name: XXX
domain: default.server.com
hw_provider: Sun_Microsystems
system type: SUNW,Netra-T5440 (UltraSPARC-T2+)
hostid: xxxxxxxx
dump_conflags: 0x10000 (DUMP_KERNEL) on /dev/dsk/c1t0d0s1(8G)
time of crash: Mon Aug 10 05:43:01 WIT 2015
age of system: 37 days 19 hours 54 minutes 4 seconds
panic CPU: 96 (96 CPUs, 15.7G memory, 2 nodes)
panic string: xt_sync: timeout

==== panic thread: 0x300c2653200 ==== CPU: 96 ====


==== panic user (LWP_SYS) thread: 0x300c2653200 PID: 22900 on CPU: 96 ====
cmd: /bin/sh /opt/scripts/xxxxxxx >>>> called from a script in /opt/scripts: THE ROOT CAUSE OF THE CRASH/HANG
t_procp: 0x60024f870b0
p_as: 0x6003a631c18 size: 1769472 RSS: 1482752

.
switch to user threads user stack

analyze is just one of the initial investigation commands. Type help to list the other commands:

CAT(vmcore.0/10V)> help
CAT(vmcore.0/10V)>

Conclusion

This knowledge transfer session and this document give an introduction to Linux and Solaris
system data collection and crash dump analysis. The material should help production support
staff gather system fault and crash details during an issue on customer servers, and
communicate with vendor support when a fault or crash on those servers needs deeper
investigation.
