kdump

Last updated: 2017-7-20

How it works

When a kernel panic occurs, the kernel relies on the kexec mechanism to quickly reboot a new instance of the kernel in a pre-reserved section of memory that had been allocated when the system booted (see below).

This permits the existing memory area to remain untouched in order to safely copy its contents to storage.

Package

kdump-tools - scripts and tools for automating kdump (Linux crash dumps)

kdump-tools

  • kexec-tools
  • makedumpfile
  • crash

 


Install

 

apt-get install kdump-tools

Generating /etc/default/kexec...
Generating grub.cfg ...
Found linux image: /boot/vmlinuz-3.2.0-23-generic-pae
Found initrd image: /boot/initrd.img-3.2.0-23-generic-pae
Found memtest86+ image: /boot/memtest86+.bin
done
update-rc.d: warning: kdump start runlevel arguments (2) do not match LSB Default-Start values (0 1 2 3 4 5)
update-rc.d: warning: kdump stop runlevel arguments (none) do not match LSB Default-Stop values (6)
Processing triggers for initramfs-tools ...
update-initramfs: Generating /boot/initrd.img-3.2.0-23-generic-pae

Files installed

  • /usr/sbin/kdump-config
  • /etc/default/kdump-tools
  • /etc/init.d/kdump-tools

P.S.

A reboot is needed before kdump becomes active (check with kdump-config status).

 

 


kdump-tools configure

 

/etc/default/kdump-tools

# 1=enable
USE_KDUMP=1
KDUMP_COREDIR="/var/crash"
# action to take if saving the vmcore fails
KDUMP_FAIL_CMD="reboot -f"

kdump-config status     ( /sys/kernel/kexec_crash_loaded )

 

# Verification

kdump-config show

USE_KDUMP:        1
KDUMP_SYSCTL:     kernel.panic_on_oops=1
KDUMP_COREDIR:    /var/crash
crashkernel addr: 0x1a000000
current state:    ready to kdump

kernel link:


kexec command:
  /sbin/kexec -p --command-line="root=UUID=f3270d03-aae9-414a-9e3e-112392be09e1 ro  irqpoll maxcpus=1 nousb" --initrd=/boot/initrd.img-3.2.0-23-generic-pae /boot/vmlinuz-3.2.0-23-generic-pae

 

dmesg | grep -i crash

[    0.000000] Reserving 64MB of memory at 400MB for crashkernel (System RAM: 499MB)
[    0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.2.0-23-generic-pae root=UUID=f3270d03-aae9-414a-9e3e-112392be09e1 ro crashkernel=384M-2G:64M,2G-:128M

 

# testing

kdump-config test

output

* no crashkernel= parameter in the kernel cmdline
Could not find an installed debug vmlinux image and
DEBUG_KERNEL is not specified in /etc/default/kdump-tools
 * makedumpfile may be limited to -d 1
USE_KDUMP:         1
KDUMP_SYSCTL:      kernel.panic_on_oops=1
KDUMP_COREDIR:     /var/crash
crashkernel addr:
kdump kernel addr: relocatable
kdump kernel:
   /boot/vmlinuz-3.2.0-23-generic-pae
kdump initrd:
  /boot/initrd.img-3.2.0-23-generic-pae
debug kernel:

kexec command to be used:
  /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-3.2.0-23-generic-pae 
    root=UUID=f3270d03-aae9-414a-9e3e-112392be09e1 ro irqpoll maxcpus=1 nousb" 
    --initrd=/boot/initrd.img-3.2.0-23-generic-pae /boot/vmlinuz-3.2.0-23-generic-pae

# cat /proc/cmdline
# crashkernel=384M-2G:64M,2G-:128M is missing from the command line
# crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset]
# if the RAM is smaller than 384M, then don't reserve anything
# if the RAM size is between 384M and 2G (exclusive), then reserve 64M
# if the RAM size is 2G or larger, then reserve 128M
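
The range rules above can be sketched as a small shell function. This is only an illustration of how a spec such as 384M-2G:64M,2G-:128M maps a RAM size to a reservation; the function names to_mb and crashkernel_size are made up for this note, only the M and G suffixes are handled, and the plain crashkernel=<size> form is not covered. The kernel does its own parsing and its notion of "system RAM" may differ slightly from what tools report.

```shell
# Convert a size such as 384M or 2G to megabytes (M and G suffixes only).
to_mb() {
    case "$1" in
        *G) echo $(( ${1%G} * 1024 )) ;;
        *M) echo "${1%M}" ;;
        *)  return 1 ;;
    esac
}

# crashkernel_size SPEC RAM_MB   (hypothetical helper, not part of kdump-tools)
# Prints the size reserved for RAM_MB megabytes of RAM,
# or nothing when no range matches.
crashkernel_size() {
    spec=${1%%@*}              # drop an optional @offset
    ram_mb=$2
    old_ifs=$IFS
    IFS=,
    set -- $spec               # split the spec into <range>:<size> entries
    IFS=$old_ifs
    for entry in "$@"; do
        range=${entry%%:*}
        size=${entry#*:}
        start=$(to_mb "${range%%-*}") || continue
        end=${range#*-}        # empty for an open-ended range such as 2G-
        if [ "$ram_mb" -lt "$start" ]; then continue; fi
        if [ -z "$end" ]; then echo "$size"; return 0; fi
        end_mb=$(to_mb "$end") || continue
        if [ "$ram_mb" -lt "$end_mb" ]; then echo "$size"; return 0; fi
    done
    return 0
}
```

For example, crashkernel_size '384M-2G:64M,2G-:128M' 499 prints 64M, which matches the "Reserving 64MB" line in the dmesg output shown earlier for a machine with 499MB of RAM.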

# Could not find an installed debug vmlinux image and
# DEBUG_KERNEL is not specified in /etc/default/kdump-tools

* kdump-config will use /usr/lib/debug/vmlinux-$(uname -r) if it is available.  
* If it is not available, makedumpfile will be limited to dumping all pages in memory

# To check whether the crash kernel is already loaded
cat /sys/kernel/kexec_crash_loaded
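
In a script the same check can be wrapped in a function. A minimal sketch, assuming the file contains 1 when a crash kernel is loaded and 0 otherwise; kdump_loaded is a name invented here, and the optional argument exists only so the function can be tried against an ordinary file.

```shell
# kdump_loaded [FILE]   (hypothetical helper)
# Returns 0 when FILE (default: /sys/kernel/kexec_crash_loaded) contains 1,
# 1 when it contains anything else, and 2 when the file cannot be read.
kdump_loaded() {
    flag_file=${1:-/sys/kernel/kexec_crash_loaded}
    [ -r "$flag_file" ] || return 2
    [ "$(cat "$flag_file")" = "1" ]
}
```

Typical use: if kdump_loaded; then echo "crash kernel loaded"; else echo "crash kernel NOT loaded"; fi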

load     - locate the kdump kernel and load it using kexec
(/etc/init.d/kdump start)

unload   - unload the kdump kernel using kexec

savecore  (save /proc/vmcore)

 

Real Crash Testing

echo c > /proc/sysrq-trigger
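
Triggering a panic without a loaded crash kernel just hangs or reboots the machine with no dump, so it can be worth guarding the command. The sketch below is one possible guard, not something kdump-tools provides; safe_crash_test is a made-up name, and both paths are parameters so the function can be exercised against plain files first. It must be run as root to do anything on a real system, and it WILL crash the machine when the check passes.

```shell
# safe_crash_test [FLAG_FILE] [TRIGGER_FILE]   (hypothetical helper)
# Writes "c" to TRIGGER_FILE only when FLAG_FILE reports a loaded crash kernel.
safe_crash_test() {
    flag_file=${1:-/sys/kernel/kexec_crash_loaded}
    trigger_file=${2:-/proc/sysrq-trigger}
    if [ "$(cat "$flag_file" 2>/dev/null)" != "1" ]; then
        echo "crash kernel not loaded; refusing to trigger a panic" >&2
        return 1
    fi
    sync                        # flush dirty data before the deliberate panic
    echo c > "$trigger_file"
}
```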

# Once booted into dump capture kernel

cp /proc/vmcore /root/crash.dump

 

ls /var/crash

linux-image-3.0.0-12-server.0.crash

 


kdump

 

kdump [options] start_address

# based on kexec
# Kdump utilizes two kernels: system kernel and dump capture kernel.

# after a crash, the dump capture kernel is booted
# it exposes the memory of the crashed system kernel through /proc/vmcore

kernel config

CONFIG_DEBUG_INFO=y
CONFIG_CRASH_DUMP=y
CONFIG_PROC_VMCORE=y

# /proc/cmdline

crashkernel=384M-2G:64M,2G-:128M

# Checking

dmesg | grep -i crash

[    0.000000] Reserving 64MB of memory at 800MB for crashkernel (System RAM: 1023MB)
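
To pick the reserved size out of the log in a script, a sed filter along these lines may be enough. It is a sketch written against the message format shown above only; other kernel versions word this line differently, so treat the pattern as an assumption. reserved_crash_mb is a name invented for this note.

```shell
# reserved_crash_mb   (hypothetical helper)
# Reads kernel log text on stdin and prints the crashkernel reservation in MB.
# Prints nothing when no matching "Reserving ...MB" line is found.
reserved_crash_mb() {
    sed -n 's/.*Reserving \([0-9][0-9]*\)MB of memory at .* for crashkernel.*/\1/p'
}
```

Usage: dmesg | reserved_crash_mb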

 


Analyzing core dump

 

As soon as the kernel crashes, a second (dump capture) kernel boots immediately.

The dump is then written to the folder /var/crash/.

 * After that, the system reboots into normal mode.

Analyzing with crash

crash /path/to/vmlinux /path/to/crash.dump

# this drops you into the crash shell
crash> help

 


Ubuntu 12.04 bug

 


makedumpfile

 

# make a small dumpfile of kdump

-d dump_level

  dump | zero | cache | cache   | user | free
 level | page | page  | private | data | page
-------+------+-------+---------+------+------
     0 |      |       |         |      |
     1 |  X   |       |         |      |
     2 |      |   X   |         |      |
.............
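
The dump level behaves like a bit mask: 1 = zero page, 2 = cache page, 4 = cache private, 8 = user data, 16 = free page, and the values add up (31 excludes everything listed). The sketch below decodes a level into the excluded page types. It reflects my reading of the makedumpfile man page, in particular that bit 4 also drops plain cache pages; check the man page of the installed version before relying on it. excluded_pages is a made-up name.

```shell
# excluded_pages LEVEL   (hypothetical helper)
# Prints the page types that makedumpfile -d LEVEL leaves out, one per line.
excluded_pages() {
    level=$1
    [ $(( level & 1 )) -ne 0 ]  && echo "zero page"
    # assumption: cache pages are dropped by bit 2, and also by bit 4
    [ $(( level & 6 )) -ne 0 ]  && echo "cache page"
    [ $(( level & 4 )) -ne 0 ]  && echo "cache private"
    [ $(( level & 8 )) -ne 0 ]  && echo "user data"
    [ $(( level & 16 )) -ne 0 ] && echo "free page"
    return 0
}
```

Usage: excluded_pages 31 lists all five page types; excluded_pages 0 prints nothing.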

 


CentOS 7

 

/etc/default/grub

GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=cl/root rd.lvm.lv=cl/swap rhgb quiet"

Change it to

GRUB_CMDLINE_LINUX="rd.lvm.lv=cl/root rd.lvm.lv=cl/swap rhgb quiet"

grub2-mkconfig -o /boot/grub2/grub.cfg

systemctl disable kdump
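
The edit to GRUB_CMDLINE_LINUX can also be done with sed instead of by hand. A minimal sketch, assuming the parameter appears as a single space-separated crashkernel=... token; strip_crashkernel is a name invented here. It works as a filter so the result can be reviewed before it replaces /etc/default/grub.

```shell
# strip_crashkernel   (hypothetical helper)
# Reads text on stdin and removes any crashkernel=... token
# together with the spaces that follow it.
strip_crashkernel() {
    sed -e 's/crashkernel=[^ "]* *//g'
}
```

Usage: strip_crashkernel < /etc/default/grub > /tmp/grub.new, then compare the two files with diff before copying the new one into place and running grub2-mkconfig.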

 

service kdump status

....
Sep 19 15:56:08 seafile.local systemd[1]: Starting Crash recovery kernel arming...
Sep 19 15:56:09 seafile.local kdumpctl[955]: No memory reserved for crash kernel
Sep 19 15:56:09 seafile.local kdumpctl[955]: Starting kdump: [FAILED]
....

 


DOC

 

https://wiki.ubuntu.com/Kernel/CrashdumpRecipe