Last updated: 2017-7-20
How it works
When a kernel panic occurs, the kernel relies on the kexec mechanism to quickly reboot a new instance of the kernel in a pre-reserved section of memory that had been allocated when the system booted (see below).
This permits the existing memory area to remain untouched in order to safely copy its contents to storage.
Package
kdump-tools - scripts and tools for automating kdump (Linux crash dumps)
kdump-tools
- kexec-tools
- makedumpfile
- crash
Install
apt-get install kdump-tools
Generating /etc/default/kexec...
Generating grub.cfg ...
Found linux image: /boot/vmlinuz-3.2.0-23-generic-pae
Found initrd image: /boot/initrd.img-3.2.0-23-generic-pae
Found memtest86+ image: /boot/memtest86+.bin
done
update-rc.d: warning: kdump start runlevel arguments (2) do not match LSB Default-Start values (0 1 2 3 4 5)
update-rc.d: warning: kdump stop runlevel arguments (none) do not match LSB Default-Stop values (6)
Processing triggers for initramfs-tools ...
update-initramfs: Generating /boot/initrd.img-3.2.0-23-generic-pae
Files installed
- /usr/sbin/kdump-config
- /etc/default/kdump-tools
- /etc/init.d/kdump-tools
Note
A reboot is required before kdump takes effect (check with kdump-config status).
kdump-tools configure
/etc/default/kdump-tools
# 1=enable
USE_KDUMP=1
KDUMP_COREDIR="/var/crash"
# action to take if saving the vmcore fails
KDUMP_FAIL_CMD="reboot -f"
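To read one of these settings from a script, something like the sketch below may work. It assumes the defaults file contains only plain shell variable assignments, as in the sample above. kdump_setting is a hypothetical helper name, not part of kdump-tools.

```shell
# Sketch: read one setting from a kdump-tools style defaults file.
# kdump_setting is a hypothetical helper, not part of kdump-tools.
# Usage: kdump_setting KEY [FILE]
kdump_setting() (
    key=$1
    file=${2:-/etc/default/kdump-tools}
    # The defaults file is plain shell variable assignments, so it can be
    # sourced. The ( ) body is a subshell, so nothing leaks into the caller.
    . "$file" || exit 1
    # Indirect lookup of the variable named by KEY (pass trusted names only).
    eval "printf '%s\n' \"\${$key}\""
)
```

Example: kdump_setting KDUMP_COREDIR prints the configured dump directory. The helper uses eval, so KEY must never come from untrusted input.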
kdump-config status ( /sys/kernel/kexec_crash_loaded )
# Verification
kdump-config show
USE_KDUMP:        1
KDUMP_SYSCTL:     kernel.panic_on_oops=1
KDUMP_COREDIR:    /var/crash
crashkernel addr: 0x1a000000
current state:    ready to kdump
kernel link:
kexec command:
  /sbin/kexec -p --command-line="root=UUID=f3270d03-aae9-414a-9e3e-112392be09e1 ro irqpoll maxcpus=1 nousb" --initrd=/boot/initrd.img-3.2.0-23-generic-pae /boot/vmlinuz-3.2.0-23-generic-pae
dmesg | grep -i crash
[ 0.000000] Reserving 64MB of memory at 400MB for crashkernel (System RAM: 499MB)
[ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.2.0-23-generic-pae root=UUID=f3270d03-aae9-414a-9e3e-112392be09e1 ro crashkernel=384M-2G:64M,2G-:128M
# testing
kdump-config test
output
 * no crashkernel= parameter in the kernel cmdline
Could not find an installed debug vmlinux image and
DEBUG_KERNEL is not specified in /etc/default/kdump-tools
 * makedumpfile may be limited to -d 1
USE_KDUMP:         1
KDUMP_SYSCTL:      kernel.panic_on_oops=1
KDUMP_COREDIR:     /var/crash
crashkernel addr:
kdump kernel addr: relocatable
kdump kernel:      /boot/vmlinuz-3.2.0-23-generic-pae
kdump initrd:      /boot/initrd.img-3.2.0-23-generic-pae
debug kernel:
kexec command to be used:
  /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-3.2.0-23-generic-pae root=UUID=f3270d03-aae9-414a-9e3e-112392be09e1 ro irqpoll maxcpus=1 nousb" --initrd=/boot/initrd.img-3.2.0-23-generic-pae /boot/vmlinuz-3.2.0-23-generic-pae
# cat /proc/cmdline
# crashkernel=384M-2G:64M,2G-:128M is missing from the kernel command line
# Syntax: crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset]
# if the RAM is smaller than 384M, then don't reserve anything
# if the RAM size is between 384M and 2G (exclusive), then reserve 64M
# if the RAM size is 2G or larger, then reserve 128M
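The range rules above can be sketched as a small shell function. This is only an illustration of the selection logic, not how the kernel implements it. It assumes each range includes its start and excludes its end, and it handles only the range form with K/M/G suffixes. to_mib and crashkernel_size are hypothetical helper names.

```shell
# Sketch: which reservation does a crashkernel=<range>:<size>,... value
# select for a given amount of RAM? Both helpers are hypothetical names.

# Convert a size such as 384M or 2G to MiB (K, M, G suffixes, or plain bytes).
to_mib() {
    case $1 in
        *G) echo $(( ${1%G} * 1024 )) ;;
        *M) echo $(( ${1%M} )) ;;
        *K) echo $(( ${1%K} / 1024 )) ;;
        *)  echo $(( $1 / 1024 / 1024 )) ;;
    esac
}

# Usage: crashkernel_size SPEC RAM_MIB
# Prints the size selected for RAM_MIB, or 0 when no range matches.
# The ( ) body is a subshell, so IFS and the variables stay local.
crashkernel_size() (
    spec=${1%%@*}      # drop an optional @offset
    ram=$2
    IFS=,
    for entry in $spec; do
        range=${entry%%:*}
        size=${entry#*:}
        start=${range%%-*}
        end=${range#*-}
        # Assumed semantics: start inclusive, end exclusive,
        # and an empty end means no upper limit.
        if [ "$ram" -ge "$(to_mib "$start")" ]; then
            if [ -z "$end" ] || [ "$ram" -lt "$(to_mib "$end")" ]; then
                echo "$size"
                exit 0
            fi
        fi
    done
    echo 0
)
```

Example: crashkernel_size '384M-2G:64M,2G-:128M' 499 prints 64M, which matches the 64MB reservation on the 499MB machine in the dmesg output above.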
# Could not find an installed debug vmlinux image and
# DEBUG_KERNEL is not specified in /etc/default/kdump-tools
* kdump-config will use /usr/lib/debug/vmlinux-$(uname -r) if it is available.
* If it is not available, makedumpfile will be limited to dumping all pages in memory
# To check whether the crash kernel is already loaded
cat /sys/kernel/kexec_crash_loaded
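For use in scripts, the same check can be wrapped in a function that returns an exit status. A minimal sketch, assuming the file contains 1 when a crash kernel is loaded. kdump_loaded is a hypothetical helper name, and the optional file argument exists only to make it testable.

```shell
# Sketch: succeed when the crash kernel is loaded.
# kdump_loaded is a hypothetical helper name.
# Usage: kdump_loaded [FILE]
kdump_loaded() (
    file=${1:-/sys/kernel/kexec_crash_loaded}
    # The file is assumed to hold a single digit: 1 = loaded, 0 = not loaded.
    [ -r "$file" ] && [ "$(cat "$file")" = "1" ]
)
```

Example: kdump_loaded && echo "crash kernel loaded" || echo "crash kernel NOT loaded"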
kdump-config subcommands
load - locate the kdump kernel and load it using kexec
(/etc/init.d/kdump start)
unload - unload the kdump kernel using kexec
savecore - save /proc/vmcore to disk
Real Crash Testing
echo c > /proc/sysrq-trigger
# Once booted into dump capture kernel
cp /proc/vmcore /root/crash.dump
ls /var/crash
linux-image-3.0.0-12-server.0.crash
kdump
kdump [options] start_address
# based on kexec
# kdump uses two kernels: the system kernel and the dump capture kernel.
# When the system kernel crashes, the dump capture kernel is booted.
# The dump capture kernel exposes the memory of the crashed system kernel as /proc/vmcore.
Kernel config
CONFIG_DEBUG_INFO=y
CONFIG_CRASH_DUMP=y
CONFIG_PROC_VMCORE=y
# /proc/cmdline
crashkernel=384M-2G:64M,2G-:128M
Checking
dmesg | grep -i crash
[ 0.000000] Reserving 64MB of memory at 800MB for crashkernel (System RAM: 1023MB)
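Besides grepping dmesg, the parameter itself can be pulled out of the kernel command line. A minimal sketch, assuming the command line is a single line of space-separated words. crashkernel_param is a hypothetical helper name, and the file argument exists only so it can be tried against a saved copy.

```shell
# Sketch: print the crashkernel= value from the kernel command line.
# crashkernel_param is a hypothetical helper name.
# Usage: crashkernel_param [CMDLINE_FILE]
# Exits non-zero when no crashkernel= parameter is present.
crashkernel_param() (
    file=${1:-/proc/cmdline}
    set -f    # no globbing while splitting the command line into words
    for word in $(cat "$file"); do
        case $word in
            crashkernel=*)
                echo "${word#crashkernel=}"
                exit 0
                ;;
        esac
    done
    exit 1
)
```

Example: crashkernel_param || echo "no crashkernel= parameter, nothing will be reserved"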
Analyzing core dump
As soon as the kernel crashes, a second kernel boots immediately.
The dump is then written to the folder /var/crash/.
* Then the system reboots again into normal mode.
Analyzing by crash
crash /path/to/vmlinux /path/to/crash.dump
# This opens the crash shell
crash> help
Ubuntu 12.04 bug
makedumpfile
# make a small dumpfile of kdump
-d dump_level
 dump  | zero | cache | cache | user | free
 level | page | page  |private| data | page
-------+------+-------+-------+------+------
   0   |      |       |       |      |
   1   |  X   |       |       |      |
   2   |      |   X   |       |      |
.............
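The dump level works as a bitmask, which the sketch below decodes. The bit values (1 zero page, 2 cache page, 4 cache private, 8 user data, 16 free page) and the rule that bit 4 also excludes plain cache pages are my reading of the makedumpfile man page, not something shown in the table above, so verify them against your version. describe_dump_level is a hypothetical helper name.

```shell
# Sketch: list the page types that "makedumpfile -d LEVEL" would exclude.
# describe_dump_level is a hypothetical helper name. The bit values are an
# assumption taken from the makedumpfile man page; check your version.
# Usage: describe_dump_level LEVEL   (LEVEL is 0..31)
describe_dump_level() (
    level=$1
    if [ $(( level & 1 )) -ne 0 ]; then echo "zero page"; fi
    # Assumption: bit 4 (cache private) excludes cache pages both with and
    # without private data, so either bit 2 or bit 4 excludes "cache page".
    if [ $(( level & 6 )) -ne 0 ]; then echo "cache page"; fi
    if [ $(( level & 4 )) -ne 0 ]; then echo "cache private"; fi
    if [ $(( level & 8 )) -ne 0 ]; then echo "user data"; fi
    if [ $(( level & 16 )) -ne 0 ]; then echo "free page"; fi
)
```

Example: describe_dump_level 31 lists all five page types, which gives the smallest dump.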
CentOS 7 (disabling kdump)
/etc/default/grub
GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=cl/root rd.lvm.lv=cl/swap rhgb quiet"
Change it to
GRUB_CMDLINE_LINUX="rd.lvm.lv=cl/root rd.lvm.lv=cl/swap rhgb quiet"
grub2-mkconfig -o /boot/grub2/grub.cfg
systemctl disable kdump
service kdump status
....
Sep 19 15:56:08 seafile.local systemd[1]: Starting Crash recovery kernel arming...
Sep 19 15:56:09 seafile.local kdumpctl[955]: No memory reserved for crash kernel
Sep 19 15:56:09 seafile.local kdumpctl[955]: Starting kdump: [FAILED]
....
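The GRUB_CMDLINE_LINUX edit above can also be done with sed instead of by hand. A minimal sketch that filters text on stdin and never touches /etc/default/grub itself. strip_crashkernel is a hypothetical helper name.

```shell
# Sketch: remove any crashkernel=... token from lines read on stdin.
# strip_crashkernel is a hypothetical helper name. It only filters text and
# does not modify any file.
strip_crashkernel() {
    # Drop the token plus the spaces after it. The value is assumed to
    # contain no spaces or double quotes.
    sed 's/crashkernel=[^" ]* *//g'
}
```

Example: strip_crashkernel < /etc/default/grub prints the edited file for review. After applying the change for real, grub2-mkconfig still has to be run as shown above.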
Doc
https://wiki.ubuntu.com/Kernel/CrashdumpRecipe