Environment
SUSE Linux Enterprise Desktop 10
SUSE Linux Enterprise Server 11
SUSE Linux Enterprise High Availability Extension
SUSE Linux Enterprise Real Time Extension
SLES Expanded Support Platform
SUSE Linux Enterprise Desktop 12
Situation
Resolution
- When did the crash occur? Please provide the exact time and date.
- What is the system's main task?
- Was this a one-time crash, or did the system encounter this issue several times? If the system crashed several times, please provide all known occurrences.
- At the time the system crashed, were any particular log entries noticed?
- If no entries can be found in /var/log/messages, were any entries written to the logs of the hardware management board?
- What was the situation on the system before it crashed? Please report any observations, e.g. an increase in CPU/RAM usage or high I/O wait.
What kind of system data is needed by Customer Care?
SUSE Customer Support uses a tool called supportutils (https://www.suse.com/c/free_tools/supportconfig-linux/) for troubleshooting. To create a system report, please run the following command as root:
supportconfig -l
This will collect all relevant system data (including older, already rotated messages files) and create a compressed file in /var/log with the following file name:
nts_$HOSTNAME_$DATE_$TIME.tbz
Please always run the most recent version of supportutils for better results, and attach this file to the service request. If outbound FTP traffic is allowed by the corporate firewall, the archive can be uploaded directly to the service request using
supportconfig -lur <11digit servicerequest number>
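To attach the archive manually, it first has to be located. The following is a minimal sketch, assuming bash and the nts_*.tbz naming described above; the function name latest_supportconfig_archive is hypothetical and not part of supportutils.

```shell
#!/bin/bash
# Print the most recently modified supportconfig archive (nts_*.tbz)
# found in a directory. The directory defaults to /var/log.
# Hypothetical helper for illustration; not shipped with supportutils.
latest_supportconfig_archive() {
    local dir=${1:-/var/log}
    local newest="" f
    for f in "$dir"/nts_*.tbz; do
        # Skip the unexpanded pattern when no archive exists.
        [ -f "$f" ] || continue
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done
    # Fail when the directory holds no archive at all.
    [ -n "$newest" ] || return 1
    printf '%s\n' "$newest"
}
# Example:
#   archive=$(latest_supportconfig_archive) && ls -lh "$archive"
```

The function returns a non-zero status if no archive is present, which usually means supportconfig has not been run yet or wrote its output elsewhere.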
For SUSE Expanded Support based systems please provide a sosreport.
In case the crash happens in a clustered environment (Novell Cluster Services or SLE11 High Availability Extension), please provide a system report for all involved nodes.
Steps to trace system reboots
Kernel Core Dump capture
If a system crashes, a kernel core dump can be captured using kdump. Its configuration is explained in TID 3374462 - Configure kernel core dump capture. A best practices document about providing kernel core dumps to Customer Care is available at TID 7010056 - Best practice for providing kernel core dumps to support incidents.
For SLES Expanded Support based systems, please consult the corresponding online documentation for RHEL5 or RHEL6 on configuring kdump.
Please note: kernel core dumps must be written completely to the dump device. To ensure this, set KDUMP_IMMEDIATE_REBOOT to "yes" in /etc/sysconfig/kdump and wait for the system to reboot itself. Cores can be very large, so this may take a while. Forcing a reboot manually could interrupt the write and result in an incomplete core. If the dump is incomplete for any reason, an analysis will not be possible.
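Whether the setting is in place can be checked before the next crash occurs. The following is a minimal sketch, assuming bash and the usual KEY="value" sysconfig format; the function name kdump_immediate_reboot_enabled and its return codes are hypothetical choices for illustration.

```shell
#!/bin/bash
# Return 0 if KDUMP_IMMEDIATE_REBOOT is set to "yes" in a kdump
# sysconfig file, 1 if it is set to anything else or not set at all,
# and 2 if the file cannot be read.
# The path defaults to /etc/sysconfig/kdump.
kdump_immediate_reboot_enabled() {
    local conf=${1:-/etc/sysconfig/kdump}
    local value
    [ -r "$conf" ] || return 2
    # Take the last uncommented assignment and strip optional quotes.
    value=$(sed -n 's/^[[:space:]]*KDUMP_IMMEDIATE_REBOOT=//p' "$conf" \
        | tail -n 1 | tr -d "\"'")
    [ "$value" = "yes" ]
}
# Example:
#   kdump_immediate_reboot_enabled || echo "immediate reboot is not enabled"
```

Commented-out assignments are ignored because the pattern only matches lines that begin with the variable name.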
Additional information
sysstat is a tool which collects system data (e.g. CPU, RAM, I/O usage) at regular intervals. Its output is also a valuable source of information when troubleshooting crash situations. Please consider installing the sysstat package and enabling its service with
/etc/init.d/boot.sysstat start
If this service is activated before the system crashes, supportconfig and sosreport will include its output in the system report.
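The collected data can also be reviewed locally with sar, which is part of the sysstat package. The following is a minimal sketch, assuming bash; the function name sar_window and its return codes are hypothetical, and the location of the data file (commonly under /var/log/sa) is an assumption that depends on the installed sysstat version.

```shell
#!/bin/bash
# Print CPU, memory, and I/O statistics from a sysstat data file for a
# given time window. All three arguments are supplied by the caller.
# Hypothetical helper; return codes 2 and 3 are illustrative choices.
sar_window() {
    local file=$1 start=$2 end=$3
    [ -r "$file" ] || return 2                    # data file missing or unreadable
    command -v sar >/dev/null 2>&1 || return 3    # sysstat not installed
    sar -u -f "$file" -s "$start" -e "$end"       # CPU utilization
    sar -r -f "$file" -s "$start" -e "$end"       # memory utilization
    sar -b -f "$file" -s "$start" -e "$end"       # I/O and transfer rates
}
# Example (hypothetical file name and times):
#   sar_window /var/log/sa/sa07 14:00:00 15:00:00
```

Choosing a window that ends shortly before the crash makes it easier to spot an increase in CPU, RAM, or I/O load leading up to it.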