ddrescue

Last updated: 2019-10-20

Introduction

 

HomePage: http://www.gnu.org/software/ddrescue/

If you use the logfile feature of ddrescue, the data is rescued very efficiently (only the needed blocks are read).

Also you can interrupt the rescue at any time and resume it later at the same point.

ddrescue does not write zeros to the output when it finds bad sectors in the input,

(so if the bad sectors are going to be left unrecovered, zero those positions in the output once yourself)

and does not truncate the output file if not asked to.

So, every time you run it on the same output file, it tries to fill in the gaps without wiping out the data already rescued.

ddrescue efficiently manages the status of the rescue in progress and tries to rescue the good parts first,

scheduling reads inside bad (or slow) areas for later.

This maximizes the amount of data that can be finally recovered from a failing drive.

* If the damaged drive is not listed in /dev, then you cannot rescue it. At least not with ddrescue.

 


Install

apt-get install gddrescue     # on Debian/Ubuntu the GNU ddrescue package is named "gddrescue"

 

Check Version

ddrescue -V

GNU ddrescue 1.16

Usage

ddrescue [options] infile outfile [logfile]

-i <bytes>                   # starting position in input file (Default 0)

-s <bytes>                  # maximum size of input data to be copied

-e [+]maxerr                 # exit when more than maxerr error areas are found ('+' counts only new ones)

-b <bytes>                    # sector size of input device [Default 512]

-f, --force                     # required when the output is a device

-A, --try-again             # mark non-split, non-trimmed blocks as non-tried

-p, --preallocate          # preallocate space on disc for output file

Unit

s = sectors, k = 1000, Ki = 1024, M = 10^6, Mi = 2^20
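
To sanity-check a size argument before passing it to -i or -s, the multipliers above can be expanded by hand. The sketch below is a hypothetical helper (to_bytes is not part of ddrescue). It assumes that the G and Gi suffixes used in later examples follow the same decimal/binary pattern, that an optional trailing 'B' is allowed, and that 's' means sectors of 512 bytes unless a second argument says otherwise.

```shell
# to_bytes SIZE [SECTOR_SIZE]: expand a ddrescue-style size into bytes.
# Hypothetical helper for checking arguments; not part of ddrescue.
to_bytes() {
    local arg=$1 sector_size=${2:-512} num unit
    num=${arg%%[!0-9]*}     # leading digits
    unit=${arg#"$num"}      # whatever follows the digits
    unit=${unit%B}          # optional trailing 'B' for "byte"
    case $unit in
        '') echo "$num" ;;
        s)  echo $(( num * sector_size )) ;;
        k)  echo $(( num * 1000 )) ;;
        Ki) echo $(( num * 1024 )) ;;
        M)  echo $(( num * 1000000 )) ;;
        Mi) echo $(( num * 1048576 )) ;;
        G)  echo $(( num * 1000000000 )) ;;      # assumed: 10^9
        Gi) echo $(( num * 1073741824 )) ;;      # assumed: 2^30
        *)  echo "unknown unit: $unit" >&2; return 1 ;;
    esac
}
```

For example, "to_bytes 50MiB" prints 52428800, which is the size used by the -s50MiB example further down.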

 


Example

 

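<0> Image sda3 to a file

ddrescue /dev/sda3 /media/backup/sda3.img ddrescue.log

Output (sample captured from dd_rescue, a separate tool with its own syntax; GNU ddrescue prints a different status screen)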

dd_rescue: (info): ipos:    580608.0k, opos:    580608.0k, xferd:    580608.0k          
+curr.rate:    31706kB/s, avg.rate:      999kB/s, avg.load:  1                   
errs:      0, errxfer:         0.0k, succxfer:    580608

dd_rescue: (info): ipos:    581632.0k, opos:    581632.0k, xferd:    581632.0k          
+curr.rate:    32043kB/s, avg.rate:     1000kB/s, avg.load:  1                   
errs:      0, errxfer:         0.0k, succxfer:    581632

<1> Disk to Disk Clone

# sda -> sdz

ddrescue -f -n /dev/sda /dev/sdz /root/ddrescue.log

ddrescue -f -d -r 3 /dev/sda /dev/sdz /root/ddrescue.log

# Overwrite the problem blocks with zeros

ddrescue -f /dev/zero /dev/sdz /root/ddrescue.log

Remark

'-n'                                    # do not try to split or retry failed blocks

'-d'                                    # use direct disc access for input file

'-f'                                     # Force overwrite of outfile. Needed when outfile is not a regular file, but a device or partition.
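
The two clone commands above are meant to be run one after the other against the same logfile. A minimal sketch of a wrapper that does this, assuming a hypothetical function name clone_disk and using only the flags already shown (-f, -n, -d, -r 3):

```shell
# clone_disk SRC DST LOG: two-stage disk clone.
# Hypothetical wrapper; it only strings together the commands shown above.
clone_disk() {
    if [ "$#" -ne 3 ]; then
        echo "usage: clone_disk SRC DST LOG" >&2
        return 2
    fi
    local src=$1 dst=$2 log=$3
    # -f overwrites DST without asking, so refuse the most obvious mistake.
    if [ "$src" = "$dst" ]; then
        echo "source and destination are the same: $src" >&2
        return 2
    fi
    # Stage 1: grab the easy areas quickly, do not split or retry failed blocks.
    ddrescue -f -n "$src" "$dst" "$log" || return
    # Stage 2: return to the failed areas with direct access and 3 retry passes.
    ddrescue -f -d -r 3 "$src" "$dst" "$log"
}
```

Usage would be "clone_disk /dev/sda /dev/sdz /root/ddrescue.log". Because both stages share the logfile, the second stage does not recopy what the first one already rescued.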

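<2> Compress the image / transfer to a remote host

# Note: these two commands call dd_rescue, a separate tool. GNU ddrescue needs a seekable output file, so it cannot write to a pipe.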

dd_rescue /dev/sda1 - | bzip2 > /dir/file.img.bz2

dd_rescue /dev/sda1 - | ssh user@remote.host "cat - > /remote/destination/file.img"

<3> Two-part recovery

a) Quick initial recovery of the most important part

     ddrescue -i0 -s50MiB /dev/hdc hdimage logfile

b) Now rescue the rest (does not recopy what is already done).

     ddrescue /dev/hdc hdimage logfile

     ddrescue -d -r3 /dev/hdc hdimage logfile

GNU ddrescue 1.16
Press Ctrl-C to interrupt
rescued:         0 B,  errsize:       0 B,  current rate:        0 B/s
   ipos:         0 B,   errors:       0,    average rate:        0 B/s
   opos:         0 B,     time since last successful read:       0 s
Copying non-tried blocks...
  • 'B' for "byte"

The total error size ('errsize') = non-trimmed + non-scraped + bad-sector blocks

 


Rescue some key disc areas

 

Options

'-i bytes' / '--input-position=bytes'

Starting position of the rescue domain in infile, in bytes. Defaults to 0.  

'-s bytes' / '--size=bytes'

Maximum size of the rescue domain, in bytes. It limits the amount of input data to be copied.

e.g.

# Copy 200 GB (200 x 10^9 bytes) starting at byte 0

ddrescue -d -i 0 -s 200G -r 2 /dev/sde sde.img ddrescue.log

# Copy 10 GiB starting at the 30 GiB offset

ddrescue -i30GiB -s10GiB /dev/hdc hdimage logfile


Direct disc access (-d)

 

If you notice that the positions and sizes in the logfile are ALWAYS multiples of the sector size,

maybe your kernel is caching the disc accesses and grouping them.

In this case you may want to use direct disc access to bypass the kernel cache and rescue more of your data.

NOTE! Sector size must be correctly set with the '--sector-size' option for this to work.

# fast reading first
ddrescue -f -n /dev/hdb1 /dev/hdc1 logfile

# slower than normal cached reading
ddrescue -f -d -r3 /dev/hdb1 /dev/hdc1 logfile

# fix & mount
e2fsck -v -f /dev/hdc1
mount -t ext2 -o ro /dev/hdc1 /mnt

'-r n' '--retry-passes=n'       # Exit after given number of retry passes. Defaults to 0 (-1=infinity)

After switching to -d, the kernel log changes from

[696330.055065] sd 7:0:0:0: [sde] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[696330.055069] sd 7:0:0:0: [sde] tag#0 Sense Key : Medium Error [current]
[696330.055072] sd 7:0:0:0: [sde] tag#0 Add. Sense: Unrecovered read error
[696330.055076] sd 7:0:0:0: [sde] tag#0 CDB: Read(10) 28 00 20 88 22 30 00 00 08 00
[696330.055078] print_req_error: critical medium error, dev sde, sector 545792560
[696330.055089] Buffer I/O error on dev sde, logical block 68224070, async page read
[696336.077706] sd 7:0:0:0: [sde] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[696336.077711] sd 7:0:0:0: [sde] tag#0 Sense Key : Medium Error [current]
[696336.077714] sd 7:0:0:0: [sde] tag#0 Add. Sense: Unrecovered read error
[696336.077717] sd 7:0:0:0: [sde] tag#0 CDB: Read(10) 28 00 20 88 22 30 00 00 08 00
[696336.077720] print_req_error: critical medium error, dev sde, sector 545792560
[696336.077731] Buffer I/O error on dev sde, logical block 68224070, async page read

to this (the "Buffer I/O error" line and the repeated attempt are gone)

[696419.260665] sd 7:0:0:0: [sde] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[696419.260669] sd 7:0:0:0: [sde] tag#0 Sense Key : Medium Error [current]
[696419.260672] sd 7:0:0:0: [sde] tag#0 Add. Sense: Unrecovered read error
[696419.260675] sd 7:0:0:0: [sde] tag#0 CDB: Read(10) 28 00 20 8a 97 00 00 00 80 00
[696419.260678] print_req_error: critical medium error, dev sde, sector 545953536
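
The sector numbers in kernel messages like the ones above can be turned into byte offsets, for example to compare them with the positions in the logfile. This is a sketch under two assumptions: the function name bad_sector_offsets is made up for illustration, and the kernel reports sectors in 512-byte units regardless of the drive's physical sector size.

```shell
# bad_sector_offsets DEV: read kernel log text on stdin and print
# "sector byte_offset" for every distinct medium error reported for DEV.
# Hypothetical helper; assumes kernel sector numbers are 512-byte units.
bad_sector_offsets() {
    local dev=$1 sector
    sed -n "s/.*critical medium error, dev $dev, sector \([0-9][0-9]*\).*/\1/p" |
        sort -n -u |
        while read -r sector; do
            printf '%s %s\n' "$sector" "$(( sector * 512 ))"
        done
}
```

Typical use would be "dmesg | bad_sector_offsets sde". The second column is a byte position of the kind that -i accepts.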

 


re-try

 

ddrescue -A --retrim -f -i0 -s300GiB  /dev/sdc /dev/sdd /home/logfile

'-A'   ('--try-again')

Mark all non-trimmed and non-scraped blocks inside the rescue domain as non-tried before beginning the rescue.

Try this if the drive stops responding and ddrescue immediately starts scraping failed blocks when restarted.

If '--retrim' is also specified, mark all failed blocks inside the rescue domain as non-tried.

'--retrim'

Mark all failed blocks inside the rescue domain as non-trimmed before beginning the rescue.
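
To see what these two options would do before running them, the status changes described above can be previewed on a copy of the logfile. The sketch below is only an illustration of the status mapping: preview_marks is a hypothetical name, the whole file is treated as the rescue domain (no -i/-s restriction), and adjacent blocks are not merged the way ddrescue itself may do.

```shell
# preview_marks MODE LOGFILE: print LOGFILE with the status changes that
# -A and/or --retrim are described as making. MODE is one of:
#   try-again   non-trimmed (*) and non-scraped (/) become non-tried (?)
#   retrim      non-scraped (/) and bad-sector (-) become non-trimmed (*)
#   both        all failed blocks (* / -) become non-tried (?)
# Hypothetical illustration; it never modifies the logfile.
preview_marks() {
    awk -v mode="$1" '
        $1 ~ /^#/ || NF == 0 { print; next }            # comments, blank lines
        !seen_status { seen_status = 1; print; next }   # current_pos line
        {
            st = $3
            if (mode == "try-again" && (st == "*" || st == "/")) st = "?"
            else if (mode == "retrim" && (st == "/" || st == "-")) st = "*"
            else if (mode == "both" && (st == "*" || st == "/" || st == "-")) st = "?"
            print $1, $2, st
        }
    ' "$2"
}
```

For example, "preview_marks both /home/logfile" shows which blocks the combined '-A --retrim' run would try again from scratch.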

 


Skip

 

Phase

-n, --no-scrape                                     # skip the scraping phase

-N, --no-trim                                        # skip the trimming phase

Size

-C, --complete-only                              # don't read new data beyond mapfile limits

-K, --skip-size=[<i>][,<max>]          # initial,maximum size to skip on read error

-b, --sector-size=<bytes>              # sector size of input device [default 512]

-c, --cluster-size=<sectors>              # sectors to copy at a time [128]

 


Logfile structure

 

Example:

# Rescue Logfile. Created by GNU ddrescue version 1.16
# Command line: ddrescue -f -d -r3 /dev/sdd /dev/sdc /home/logfile
# current_pos  current_status
0x48B70000     ?
#      pos        size  status
0x00000000  0x2D778000  +
0x2D778000  0x00008000  *
0x2D780000  0x0045C000  +
0x2DBDC000  0x00004000  *
0x2DBE0000  0x00000200  -
0x2DBE0200  0x00010000  *
0x2DBF0200  0x1AD8EE00  +
0x4897F000  0x00001000  *
0x48980000  0x00200000  +
0x48B80000  0x1D176580000  ?

Status line (current_status)

Character     Meaning

  • '?'     copying non-tried blocks
  • '*'     trimming non-trimmed blocks
  • '/'     scraping non-scraped blocks
  • '-'     retrying bad sectors
  • 'F'     filling specified blocks
  • 'G'     generating approximate logfile
  • '+'     finished

Data block list (status column)

Character     Meaning

  • '?'     # non-tried block
  • '*'     # failed block non-trimmed
  • '/'     # failed block non-scraped
  • '-'     # failed block bad-sectors (bad blocks)
  • '+'    # finished blocks (good sectors)
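
Since the logfile is plain text with hexadecimal positions and sizes, the totals per status can be added up with a few lines of shell. A minimal sketch, assuming the layout shown in the example above (comment lines start with '#', the first non-comment line is the current_pos line, every following line is "pos size status"); summarize_mapfile is a hypothetical name.

```shell
# summarize_mapfile LOGFILE: add up the block sizes per status character.
# Hypothetical helper; errsize follows the formula
# non-trimmed + non-scraped + bad-sector.
summarize_mapfile() {
    local non_tried=0 non_trimmed=0 non_scraped=0 bad=0 rescued=0
    local seen_status_line=0 pos size status
    while read -r pos size status; do
        case $pos in ''|'#'*) continue ;; esac       # blank or comment line
        if [ "$seen_status_line" -eq 0 ]; then       # current_pos line
            seen_status_line=1
            continue
        fi
        # $size is expanded as text so the 0x prefix is parsed as hex.
        case $status in
            '?') non_tried=$(( non_tried + $size )) ;;
            '*') non_trimmed=$(( non_trimmed + $size )) ;;
            '/') non_scraped=$(( non_scraped + $size )) ;;
            '-') bad=$(( bad + $size )) ;;
            '+') rescued=$(( rescued + $size )) ;;
        esac
    done < "$1"
    printf 'rescued=%s non-tried=%s non-trimmed=%s non-scraped=%s bad-sector=%s errsize=%s\n' \
        "$rescued" "$non_tried" "$non_trimmed" "$non_scraped" "$bad" \
        "$(( non_trimmed + non_scraped + bad ))"
}
```

All values are printed in bytes. Running it on the logfile between passes gives a quick view of how much is still pending in each category.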

 


Example: Recovering a CD

 

# recovery

ddrescue -n -b2048 /dev/sr0 cd.iso mapfile

ddrescue -d -r 3 -b2048 /dev/sr0 cd.iso mapfile

remark

-b # sector size of input device [default 512]

For CDs, always use -b2048 (CD-ROM data sectors are 2048 bytes).

If errsize is 0 when ddrescue finishes, the CD has been fully rescued.
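
The same check can be scripted against the mapfile instead of reading the status screen. The sketch below uses a stricter test than errsize alone: it requires every data block to be marked '+', so leftover non-tried areas also count as "not done". The function name is_fully_rescued is hypothetical, and the mapfile layout is assumed to be the one shown in the "Logfile structure" section.

```shell
# is_fully_rescued MAPFILE: exit status 0 only if every data block is '+'.
# Hypothetical helper; the first non-comment line is the current_pos line.
is_fully_rescued() {
    awk '
        $1 ~ /^#/ || NF == 0 { next }                  # comments, blank lines
        !seen_status { seen_status = 1; next }         # current_pos line
        { blocks++; if ($3 != "+") unfinished = 1 }
        END { if (unfinished || blocks == 0) exit 1 }
    ' "$1"
}
```

For example: is_fully_rescued mapfile && echo "cd.iso is complete"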

# Identifying Discs (iso image)

apt-get install genisoimage

isoinfo -d -i /dev/sr0

-d                     # Print information from the primary volume descriptor (PVD)  of  the  iso9660  image.

-i iso_image

# Recover the contents from the image

mount -o ro,loop -t iso9660 image.iso /mnt/mountpoint

 


Fill Mode (--fill)

 

If you use the "--fill" option, ddrescue does not rescue anything.

Instead, it fills the blocks of the given types (?*/-+l) in the output file with data read from the input file.

Note that in fill mode the input file is always read from position 0.

In fill mode the input file may have any size.

If it is too small, the data will be duplicated as many times as necessary to fill the input buffer.

If it is too big, only the needed data will be read.
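
The "duplicated as many times as necessary" rule can be pictured with a tiny tiling function. This only illustrates the idea for a short ASCII pattern; tile_pattern is a hypothetical name, and how ddrescue aligns the copies inside its own buffer is not covered here.

```shell
# tile_pattern PATTERN SIZE: repeat PATTERN until SIZE bytes are produced,
# cutting the last copy short if needed. Hypothetical illustration only;
# assumes an ASCII pattern (one byte per character).
tile_pattern() {
    local pattern=$1 size=$2 out=
    if [ -z "$pattern" ]; then
        echo "empty pattern" >&2
        return 1
    fi
    while [ "${#out}" -lt "$size" ]; do
        out="$out$pattern"
    done
    # The precision in %.Ns truncates the string to SIZE characters.
    printf "%.${size}s" "$out"
}
```

With an 8-byte pattern and a 512-byte sector, the pattern fits exactly 64 times, so a hex dump of a filled sector shows the marker at every 8-byte boundary.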

e.g.

# fills all areas marked as ‘-’ (bad-sector blocks) with copies of the marker string "BaDB1K~!"

echo -n "BaDB1K~!" > tmpfile

ddrescue --fill='-' tmpfile sde.img ddrescue.log
ddrescue --fill='*' tmpfile sde.img ddrescue.log
ddrescue --fill='/' tmpfile sde.img ddrescue.log
ddrescue --fill='?' tmpfile sde.img ddrescue.log

Finding out which files are corrupt

1) Copy the damaged drive with ddrescue until finished.

    Do not use sparse writes. This yields a logfile with only finished (‘+’) and bad (‘-’) areas.

    -S, --sparse               # use sparse writes for output file

2) Mount the copied drive (or the image file, via loopback device).

3) Compute a md5sum or other checksum for every file.

    Build a list of all the files and their checksums.

4) Fill the bad areas of the copied drive or image file with a byte value different from zero.

5) Verify the checksums. Those files which have different checksums this time reside (at least partially) in damaged disk areas.
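
Steps 3 and 5 can be sketched with md5sum. The function names below are hypothetical, GNU md5sum is assumed (for -c and --quiet), and the checksum list must be stored outside the mounted copy and passed as an absolute path.

```shell
# snapshot_checksums DIR: step 3, print an md5sum list for every file in DIR.
# Hypothetical helper; paths in the list are relative to DIR.
snapshot_checksums() {
    ( cd "$1" && find . -type f -exec md5sum {} + )
}

# changed_files DIR LIST: step 5, print the files whose checksum no longer
# matches LIST (an absolute path to the output of snapshot_checksums).
changed_files() {
    ( cd "$1" &&
      md5sum -c --quiet "$2" 2>/dev/null |
      sed -n 's/: FAILED\( open or read\)*$//p' )
}
```

A run would look like: snapshot_checksums /mnt > /root/before.md5, then unmount, fill the bad areas (step 4), mount again read-only, and finally changed_files /mnt /root/before.md5. The files printed are the ones that overlap damaged areas.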

 


The recovery process

 

The recovery runs in 3 phases. At the end, every block is either good ("+") or bad ("-").

Flow: non-tried -> non-trimmed -> non-scraped -> bad-sector

  • non-tried         # Size of the part of the rescue domain pending to be tried.
  • non-trimmed    # Size of the part of the rescue domain pending to be trimmed.
  • non-scraped     # Size of the part of the rescue domain pending to be scraped.

First phase - Copying

Copying is done in up to 5 passes.

The first pass

It reads the non-tried parts of the input file, marking the failed blocks as non-trimmed and skipping beyond them.

The second pass

It delimits the blocks skipped by the first pass.

The third and fourth passes

They read the blocks skipped due to slow areas (if any) by the first two passes, in the same direction that each block was skipped.

For each block, passes 2 to 4 skip the rest of the block after finding the first error in the block.

The last pass

It is a sweeping pass, with skipping disabled.

 

The copying direction is reversed after each pass until all the rescue domain is tried.

Only non-tried areas are read in large blocks. Trimming, scraping and retrying are done sector by sector.

 

Each sector is tried at most two times; the first in this phase as part of a large block read,

the second in one of the phases below as a single sector read.

 

The purpose of the multiple passes is to delimit large bad areas fast,

recover the most promising areas first, keep the mapfile small,

and produce good starting points for trimming.

 

Second phase - Trimming

Trimming is done in one pass.

For each non-trimmed block, read forwards one sector at a time from the leading edge of the block until a bad sector is found.

Then read backwards one sector at a time from the trailing edge of the block until a bad sector is found.

Then mark the bad sectors found (if any) as bad-sector, and mark the rest of the block as non-scraped without trying to read it.

 

Third phase - Scraping

Scrape together the data not recovered by the copying or trimming phases. Scraping is done in one pass.

Each non-scraped block is read forwards, one sector at a time. Any bad sectors found are marked as bad-sector.
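
The trimming and scraping rules can be tried out on a toy block. In this sketch a block is a string with one character per sector, 'G' for a readable sector and 'B' for a bad one, and the result uses the logfile status characters. It is a simulation of the description above under those assumptions, not ddrescue's actual code, and both function names are hypothetical.

```shell
# trim_block SECTORS: simulate trimming of one non-trimmed block.
# Reads forwards from the leading edge until the first bad sector, then
# backwards from the trailing edge until the first bad sector. Sectors
# that were read become '+' or '-', the untouched middle becomes '/'.
trim_block() {
    local rest=$1 lead= trail= middle c
    while [ -n "$rest" ]; do                 # forward pass
        c=${rest%"${rest#?}"}                # first remaining sector
        rest=${rest#?}
        if [ "$c" = G ]; then lead="$lead+"; else lead="$lead-"; break; fi
    done
    while [ -n "$rest" ]; do                 # backward pass
        c=${rest#"${rest%?}"}                # last remaining sector
        rest=${rest%?}
        if [ "$c" = G ]; then trail="+$trail"; else trail="-$trail"; break; fi
    done
    middle=$(printf '%s' "$rest" | tr 'GB' '//')   # not read: non-scraped
    printf '%s%s%s\n' "$lead" "$middle" "$trail"
}

# scrape_block SECTORS STATES: simulate scraping. Every '/' sector is read
# forwards, one at a time, and becomes '+' or '-'. Other states are kept.
scrape_block() {
    local sectors=$1 states=$2 out= s t
    while [ -n "$states" ]; do
        s=${sectors%"${sectors#?}"}; sectors=${sectors#?}
        t=${states%"${states#?}"};   states=${states#?}
        if [ "$t" = / ]; then
            if [ "$s" = G ]; then t=+; else t=-; fi
        fi
        out="$out$t"
    done
    printf '%s\n' "$out"
}
```

For the block GGBGGBBGG, trimming reads three sectors from each edge and leaves the three in the middle as non-scraped. Scraping then reads those three one by one and finds one more bad sector.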