Last updated: 2019-09-13
Introduction
ghettoVCB.sh is a script that backs up virtual machines running on ESXi.
Supported backup media: local storage, SAN and NFS
Tested on ESXi 3.5/4.x/5.x/6.x/7.x
1. The script takes a snapshot of each live, running virtual machine,
2. backs up the master VMDK(s), and then, upon completion,
3. deletes the snapshot until the next backup.
4. It can be set up to run via cron.
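The three steps above can be sketched as a plan of ESXi commands. This is only an illustration of the flow, not ghettoVCB's actual implementation: the helper name print_backup_plan, the snapshot name and the paths are hypothetical, and the commands are printed rather than executed.

```shell
# Print (do not run) the commands for one snapshot-based backup pass.
# Hypothetical helper: print_backup_plan VMID SRC_VMDK DST_VMDK
print_backup_plan() {
    vmid=$1
    src=$2
    dst=$3
    # 1. Snapshot the running VM (last two args: include memory = 0, quiesce = 0).
    echo "vim-cmd vmsvc/snapshot.create $vmid example-snapshot backup 0 0"
    # 2. Clone the master VMDK to the backup location as a thin disk.
    echo "vmkfstools -i $src $dst -d thin"
    # 3. Remove the snapshot so the VM runs on its base disk again.
    echo "vim-cmd vmsvc/snapshot.removeall $vmid"
}

# Placeholder VM id and paths, for illustration only.
print_backup_plan 1 /vmfs/volumes/datastore1/srv2012/srv2012_1.vmdk \
    /vmfs/volumes/backup_disk/VMBackup/srv2012/srv2012_1.vmdk
```

Printing the plan first makes it easy to review the exact commands before trying them by hand on a test VM.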
* Supports VM(s) with existing snapshots
* Dry-run mode (dryrun)
* Quick email status summary
* Simple locking mechanism to ensure only one instance of ghettoVCB runs per host
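A single-instance lock of this kind is often built on mkdir, which either creates the directory or fails atomically. The sketch below only illustrates the idea; the lock path and function names are hypothetical, and this is not necessarily how ghettoVCB implements its lock.

```shell
# Hypothetical lock directory; override LOCK_DIR to use another location.
LOCK_DIR=${LOCK_DIR:-/tmp/ghettoVCB-example.lock}

# Try to take the lock. Returns 0 on success, non-zero if it is already held.
acquire_lock() {
    # mkdir fails if the directory already exists, so only one caller wins.
    mkdir "$LOCK_DIR" 2>/dev/null
}

# Release the lock so the next run can start.
release_lock() {
    rmdir "$LOCK_DIR" 2>/dev/null
}

if acquire_lock; then
    echo "lock acquired, backup may run"
    release_lock
else
    echo "another instance is running, exiting"
fi
```

A real script would also release the lock from a trap handler so that an interrupted run does not leave a stale lock behind.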
Backup VMDK(s) can be written in any of the following formats:
- ZEROEDTHICK (default behavior)
- 2GB SPARSE
- THIN
- EAGERZEROEDTHICK
Download: https://github.com/lamw/ghettoVCB
Doc: https://communities.vmware.com/docs/DOC-8760
Installation
Install the VIB
esxcli software vib install -v /vghetto-ghettoVCB.vib -f
Once installed, the ghettoVCB configuration files are located in:
/etc/ghettovcb/ghettoVCB.conf
/etc/ghettovcb/ghettoVCB-restore_vm_restore_configuration_template
/etc/ghettovcb/ghettoVCB-vm_backup_configuration_template
Both ghettoVCB and ghettoVCB-restore scripts are located in:
/opt/ghettovcb/bin/ghettoVCB.sh
/opt/ghettovcb/bin/ghettoVCB-restore.sh
Configuration (ghettoVCB.conf)
Create the config folder
mkdir -p /vmfs/volumes/Backup_4T/ghettoVCB/config
ghettoVCB.conf
ghettoVCB.conf = global ghettoVCB configuration file
1) Where to store the backups
VM_BACKUP_VOLUME=/vmfs/volumes/backup_disk/VMBackup
2) Disk format of the backup
# zeroedthick, eagerzeroedthick, thin, and 2gbsparse
DISK_BACKUP_FORMAT=thin
3) How many backup copies to keep
VM_BACKUP_ROTATION_COUNT=2
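To make the effect of a rotation count concrete, here is a minimal sketch that keeps only the newest N backup directories. It assumes each backup is a directory whose name ends in a sortable timestamp (for example myvm-2019-09-13_02-33-14); that layout and the function name rotate_backups are assumptions for this sketch, not a description of ghettoVCB's own code.

```shell
# Keep only the newest $2 backup directories under $1 and delete the rest.
rotate_backups() {
    backup_root=$1
    keep=$2
    # Refuse to run with an empty path, since rm -rf follows.
    [ -n "$backup_root" ] || return 1
    count=0
    # Reverse name order puts the newest timestamp first.
    ls -1r "$backup_root" | while IFS= read -r dir; do
        [ -d "$backup_root/$dir" ] || continue
        count=$((count + 1))
        if [ "$count" -gt "$keep" ]; then
            rm -rf "$backup_root/$dir"
        fi
    done
}
# Example call (placeholder path, not run here):
#   rotate_backups /vmfs/volumes/backup_disk/VMBackup/myvm 2
```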
4) Back up while the VM is powered on (0 = do not power it down first)
POWER_VM_DOWN_BEFORE_BACKUP=0
5) Defining whether virtual machine memory is snapped
Memory: If the <memory> flag is 1 or true, a dump of the internal state of the virtual machine is included in the snapshot. Memory snapshots take longer to create, but allow reversion to a running virtual machine state as it was when the snapshot was taken. This option is selected by default. If this option is not selected, and quiescing is not selected, the snapshot will create files which are crash-consistent, which you can use to reboot the virtual machine.
Note: When taking a memory snapshot, the entire state of the virtual machine will be stunned. For more information, see "Taking a snapshot with virtual machine memory renders the virtual machine to an inactive state while the memory is written to disk" (1013163).
VM_SNAPSHOT_MEMORY=0
6) Quiescing
Quiesce: If the <quiesce> flag is 1 or true, and the virtual machine is powered on when the snapshot is taken,
VMware Tools is used to quiesce the file system in the virtual machine.
Quiescing a file system is a process of bringing the on-disk data of a physical
or virtual computer into a state suitable for backups.
This process might include such operations as flushing dirty buffers from the operating system's in-memory cache to disk,
or other higher-level application-specific tasks.
Note: Quiescing indicates pausing or altering the state of running processes on a computer,
particularly those that might modify information stored on disk during a backup,
to guarantee a consistent and usable backup. Quiescing is not necessary for memory snapshots; it is used primarily for backups.
VM_SNAPSHOT_QUIESCE=0
Mail Settings
# *** Please enable firewall rule for email traffic on port 25 ***
# Defining whether or not to email backup logs
EMAIL_LOG=1
# Defining email server & port:
EMAIL_SERVER=r
EMAIL_SERVER_PORT=25
EMAIL_FROM=s@s
EMAIL_TO=r@r
Usage Example
# Dry run Mode (no backup will take place)
./ghettoVCB.sh -f /vmfs/volumes/backup_disk/ghettoVCB/vms_to_backup.txt \
    -g /vmfs/volumes/backup_disk/ghettoVCB/ghettoVCB.conf \
    -l /vmfs/volumes/backup_disk/ghettoVCB/backup.log \
    -d dryrun
-g Path to global ghettoVCB configuration file
-d Debug level [info|debug|dryrun] (default: info)
dryrun is useful for troubleshooting problems, for example:
2019-07-26 08:32:32 -- dryrun: ###############################################
2019-07-26 08:32:32 -- dryrun: Virtual Machine: srv2012
2019-07-26 08:32:32 -- dryrun: VM_ID: 1
2019-07-26 08:32:32 -- dryrun: VMX_PATH: /vmfs/volumes/datastore1/srv2012/srv2012.vmx
2019-07-26 08:32:32 -- dryrun: VMX_DIR: /vmfs/volumes/datastore1/srv2012
2019-07-26 08:32:32 -- dryrun: VMX_CONF: srv2012/srv2012.vmx
2019-07-26 08:32:32 -- dryrun: VMFS_VOLUME: datastore1
2019-07-26 08:32:32 -- dryrun: VMDK(s):
2019-07-26 08:32:32 -- dryrun: /vmfs/volumes/573c8235-ca82de29-925b-3417ebef4403/srv2012/srv2012_1.vmdk 500 GB
2019-07-26 08:32:32 -- dryrun: srv2012-000001.vmdk 600 GB
2019-07-26 08:32:32 -- dryrun: INDEPENDENT VMDK(s):
2019-07-26 08:32:32 -- dryrun: TOTAL_VM_SIZE_TO_BACKUP: 1100 GB
2019-07-26 08:32:32 -- dryrun: Snapshots found for this VM, please commit all snapshots before continuing!
2019-07-26 08:32:32 -- dryrun: THIS VIRTUAL MACHINE WILL NOT BE BACKED UP DUE TO EXISTING SNAPSHOTS!
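With many VMs the dry-run log gets long, so it can help to pull out only the VMs that will be skipped. The following is a small sketch that relies on the "Virtual Machine:" and "WILL NOT BE BACKED UP" lines shown in the log above; the function name skipped_vms is hypothetical.

```shell
# Print the name of every VM that the dry run says will be skipped.
# $1 is the path to a log produced with "-d dryrun".
skipped_vms() {
    awk '
        # Remember the most recent VM name announced in the log.
        /dryrun: Virtual Machine: / { sub(/.*Virtual Machine: /, ""); vm = $0 }
        # When the skip notice appears, report that VM.
        /WILL NOT BE BACKED UP/ { print vm }
    ' "$1"
}
# Example call (placeholder path, not run here):
#   skipped_vms /vmfs/volumes/backup_disk/ghettoVCB/backup.log
```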
# Specify which VMs to back up
vms_to_backup.txt (one VM name per line)
VM1
VM2
VM3
# List VM names from the command line
vim-cmd vmsvc/getallvms
Vmid Name File Guest OS Version Annotation
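The VM list file can be generated from that output instead of being typed by hand. Below is a rough sketch that takes the name as everything between the numeric Vmid and the "[datastore]" field, so names with spaces survive; the function name vm_names is hypothetical, and multi-line annotations in real output may need extra filtering.

```shell
# Extract VM names from "vim-cmd vmsvc/getallvms" output read on stdin.
vm_names() {
    awk 'NR > 1 && $1 ~ /^[0-9]+$/ {
        name = ""
        # Collect fields until the one that starts with "[" (the datastore).
        for (i = 2; i <= NF && $i !~ /^\[/; i++)
            name = (name == "" ? $i : name " " $i)
        print name
    }'
}
# On an ESXi host (not run here):
#   vim-cmd vmsvc/getallvms | vm_names > vms_to_backup.txt
```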
Remark
# Backup VMs stored in a list
./ghettoVCB.sh -f /etc/ghettovcb/vms_to_backup.txt
# Backup Single VM using command-line
./ghettoVCB.sh -m MyVM
# Backup All VMs residing on specific ESX(i) host
./ghettoVCB.sh -a
Cronjob
Important Note:
Always redirect the ghettoVCB output to /dev/null or to a log when automating via cron.
This is important because one user identified a limited output buffer that, once filled,
may cause ghettoVCB to stop in the middle of a backup.
This primarily only affects users on ESXi, but it is good practice to always redirect the output.
Also ensure you are specifying the FULL PATH when referencing the ghettoVCB script, input or log files.
Backup Script
mkdir /vmfs/volumes/Backup_4T/ghettoVCB
cd /vmfs/volumes/Backup_4T/ghettoVCB
start_backup.sh
#!/bin/sh
# backup script
scriptRoot=/vmfs/volumes/backup_disk/ghettoVCB/script
ghettoVCB=/opt/ghettovcb/bin/ghettoVCB.sh
# Run the backup; console output is discarded because -l already writes a log.
"$ghettoVCB" -f "$scriptRoot/vms_to_backup.txt" \
    -g "$scriptRoot/ghettoVCB.conf" \
    -l "$scriptRoot/backup.log" > /dev/null 2>&1
chmod 755 start_backup.sh
Create a cron job that calls the backup script
# Back up once a week, every Sunday
/bin/echo "0 12 * * 0 /vmfs/volumes/backup_disk/ghettoVCB/start_backup.sh" >> /var/spool/cron/crontabs/root
Notes:
* Note that ESXi uses UTC, so hour 12 means the backup starts at 20:00 HKT
* Sunday is not written as "7"; use 0
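The UTC offset in the first note is simple arithmetic: HKT is UTC+8, so subtract 8 hours and wrap around midnight. A small sketch (the function name hkt_to_utc_hour is hypothetical; pass the hour without a leading zero):

```shell
# Convert an hour in HKT (UTC+8) to the UTC hour used in the crontab.
hkt_to_utc_hour() {
    echo $(( ($1 - 8 + 24) % 24 ))
}

hkt_to_utc_hour 20   # prints 12
```

When the conversion wraps past midnight (for example 02:00 HKT becomes 18:00 UTC), the UTC day is the previous one, so the day-of-week field has to move back by one as well.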
Restart crond on ESXi 5.1
#1 Stop
kill $(cat /var/run/crond.pid)
#2 Check (expect no output)
# -c Display verbose command line
ps -c | grep '[c]rond'
#3 Start
crond
Keep cron jobs after a reboot
ESXi does not keep crontab changes across reboots, so add the following to /etc/rc.local.d/local.sh:
/bin/kill $(cat /var/run/crond.pid)
/bin/echo "0 12 * * 0 /vmfs/volumes/backup_disk/ghettoVCB/start_backup.sh" >> /var/spool/cron/crontabs/root
crond
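Running the echo line by hand more than once leaves duplicate entries in the crontab. One way to avoid that is to append the entry only when it is missing; the helper below is a sketch and its name add_cron_once is hypothetical.

```shell
# Append a cron entry to a crontab file only if that exact line is missing.
add_cron_once() {
    crontab_file=$1
    entry=$2
    # -F fixed string, -x match the whole line, -q no output
    if ! grep -qxF -- "$entry" "$crontab_file" 2>/dev/null; then
        echo "$entry" >> "$crontab_file"
    fi
}
# On the ESXi host (not run here):
#   add_cron_once /var/spool/cron/crontabs/root \
#       "0 12 * * 0 /vmfs/volumes/backup_disk/ghettoVCB/start_backup.sh"
```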
Stopping ghettoVCB Process
Interactively running ghettoVCB:
Step 1 - Press Ctrl+C which will kill off the main ghettoVCB instance
Step 2 - Search for any existing ghettoVCB process by running the following:
ps -c | grep ghettoVCB | grep -v grep
ps -c | grep vmkfstools | grep -v grep
-c Display verbose command line
Step 3 - remove any existing snapshots that may exist on the VM that was being backed up
Troubleshooting
ghettoVCB backup fails
Log:
2019-09-13 02:33:14 -- info: Initiate backup for vm.myserver
2019-09-13 02:33:14 -- info: Creating Snapshot "ghettoVCB-snapshot-2019-09-13" for vm.myserver
Destination disk format: VMFS thin-provisioned
Cloning disk '/vmfs/volumes/ADATA-SSD-SU650-480G/myserver/vm.myserver.vmdk'...
^MClone: 10% done.^MClone: 11% done. ... ^MClone: 99% done.^MClone: 100%
2019-09-13 03:30:41 -- info: ERROR: error in backing up of "/vmfs/volumes/SSD/myserver/vm.myserver.vmdk" for vm.myserver
2019-09-13 03:30:43 -- info: Removing snapshot from vm.myserver ...
2019-09-13 03:30:43 -- info: Backup Duration: 57.48 Minutes
2019-09-13 03:30:43 -- info: ERROR: Unable to backup vm.myserver due to error in VMDK backup!
...
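Because cron runs discard console output, failures like the one above are only visible in the log file. A minimal sketch for checking a log after a run, based on the "info: ERROR:" prefix seen above (the function name log_errors is hypothetical):

```shell
# Print every ERROR entry in a ghettoVCB log.
# Exit status 0 means at least one error was found. $1 is the -l log path.
log_errors() {
    grep 'info: ERROR:' "$1"
}
# Example use after a scheduled run (not executed here):
#   if log_errors /vmfs/volumes/backup_disk/ghettoVCB/script/backup.log; then
#       echo "backup reported errors"
#   fi
```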
Troubleshoot flow:
[1] check image with vmkfstools
# -x --fix [check|repair]
vmkfstools -x check vm.myserver.vmdk
Disk is error free
Surprisingly, the check reports no problem with the file.
[2] Test by cloning the disk manually
Because ghettoVCB uses a snapshot plus vmkfstools to clone the disk,
we first try cloning the VM's snapshot image by hand.
... Clone: 100% done.Failed to clone disk: Input/output error (327689).
[Fix]
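Because the clone fails, the files inside the image have to be copied to a new image.
(Do not clone it with dd! The image itself is faulty, so the new image produced by dd will still have the file system problems and may even be in worse shape than the original.)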
Snapshot found
log file:
2020-08-03 12:47:12 -- info: Snapshot found for myvm_2003 backup will not take place
The GUI shows no snapshots
CLI
ls -1
myvm_2003.vmdk
myvm_2003-flat.vmdk
myvm_2003-000001-delta.vmdk
myvm_2003-000001.vmdk
[Fix] Create a new snapshot and then use "Delete All" to clear all the snapshots.
That seems to clear up partially completed snapshots.
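Leftover snapshot disks like the ones in the listing above can be spotted by file name before a backup is attempted. The sketch below matches the -000001.vmdk and -000001-delta.vmdk naming seen there; the function name find_snapshot_disks is hypothetical, and a VM whose own name ends in a dash plus six digits would be a false positive.

```shell
# List snapshot disk files left in a VM directory ($1).
# Exit status is non-zero when none are found.
find_snapshot_disks() {
    ls -1 "$1" | grep -E -- '-[0-9]{6}(-delta)?\.vmdk$'
}
# Example call (placeholder path, not run here):
#   find_snapshot_disks /vmfs/volumes/datastore1/myvm_2003
```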
Remark
ESXi "Delete All"
Use the Delete All option to delete all snapshots from the Snapshot Manager.
Delete all consolidates and writes the changes that occur between snapshots and
the previous delta disk states to the base parent disk and merges them with the base virtual machine disk.