ddrescue

最後更新: 2024-06-25

目錄

 


介紹

HomePage: http://www.gnu.org/software/ddrescue/

----

ddrescue is not a derivative of dd

----

If you use the logfile feature of ddrescue, the data is rescued very efficiently, (only the needed blocks are read).

Also you can interrupt the rescue at any time and resume it later at the same point.

So, every time you run it on the same output file, it tries to fill in the gaps without wiping out the data already rescued.

ddrescue does not write zeros to the output when it finds bad sectors in the input,
and does not truncate the output file if not asked to.
(所以當不理會 bad sector 時 zero 那位置一次)

----

ddrescue manages efficiently the status of the rescue in progress and tries to rescue the good parts first,
scheduling reads inside bad (or slow) areas for later.

This maximizes the amount of data that can be finally recovered from a failing drive.

* If the damaged drive is not listed in /dev, then you cannot rescue it. At least not with ddrescue.

----

Newer versions try to minimize head movement to minimize drive damage. (Ver >= 1.19)

 


Recovery 的過程 (Algorithm)

 

Recovery 的過程一共有 6 個 Phase, 最終只有 Good ("+") / Bad Block ("-")

流程: non-tried -> non-trimmed -> non-scraped -> bad-sector

  • non-tried         # Size of the part of the rescue domain pending to be tried.
  • non-trimmed    # Size of the part of the rescue domain pending to be trimmed.
  • non-scraped     # Size of the part of the rescue domain pending to be scraped.

Phase 1 - Copying as much data as possible

Phase 1 的目的

Copying is done in up to 5 passes.

The purpose of the multiple passes is to delimit large bad areas fast,
 recover the most promising areas first, keep the mapfile small,
 and produce good starting points for trimming.

 * Only non-tried areas are read in large blocks.
    Trimming(P2), scraping(P3), and retrying(P4) are done sector by sector.

The first pass

It reads the non-tried parts of the input file,
 marking the failed blocks as non-trimmed and skipping beyond them.

Parameter: --cpass=1

Copying non-tried blocks... Pass 1 (forwards)

The second pass

runs in the opposite direction as the first pass and
 delimits the blocks skipped by the first pass.

 * The first two passes also skip beyond slow areas.
    The areas skipped are tried later in one or three additional passes (before trimming).

Parameter: --cpass=2

Copying non-tried blocks... Pass 2 (backwards)

The third and fourth passes

It read the blocks skipped due to slow areas (if any) by the first two passes,
in the same direction that each block was skipped.

For each block, skip the rest of the block after finding the first error in the block.

The last pass

It is a sweeping pass, with skipping disabled.

Phase 2 及 3 是一齊出現才有效

Phase 2 - Trimming

Trimming is done in one pass.

For each non-trimmed block, read forwards one sector at a time from the leading edge of the block until a bad sector is found.

Then read backwards one sector at a time from the trailing edge of the block until a bad sector is found.

Then mark the bad sectors found (if any) as bad-sector, and mark the rest of the block as non-scraped without trying to read it.

Then mark the bad sectors found (if any) as bad-sector,
and mark the rest of the block as non-scraped without trying to read it.

Phase 3 - Scraping

Scraping is done in one pass.

Scrape together the data not recovered by the copying or trimming phases.

Each non-scraped block is read forwards, one sector at a time. Any bad sectors found are marked as bad-sector.

Phase 4 - Retrying

Optionally try to read again the bad sectors until the number of retry passes specified is reached.
(Every bad sector is tried only once in each pass)

The direction is reversed after each pass

Phase 5(Optionally) - write a mapfile for later use

If the output file is a regular file created by ddrescue, the areas marked as bad-sector will contain zeros.
If it is a device or a previously existing file, the areas marked as bad-sector will still contain the data previously present there.

 


Disk Read Timeout (OS)

 

"dd" 係會使用系統的 timeout 及 eh_timeout 設定

timeout

# To control the command timer

/sys/block/<deviceName>/device/timeout                # SCSI default: 30

eh_timeout

# The timeout value for "TEST UNIT READY" and "REQUEST SENSE" commands used by the SCSI error handling code.

/sys/block/<deviceName>/device/eh_timeout           # Default: 10

修改

e.g.

echo 3 > /sys/block/sdf/device/timeout

echo 3 > /sys/block/sdf/device/eh_timeout

 


Install

 

apt-get install ddrescue

ddrescue -V

GNU ddrescue 1.16

 


Unit

 

s = sectors, k = 1000, Ki = 1024, M = 10^6, Mi = 2^20

 

 


Usage

 

ddrescue [options] infile outfile [logfile]

-i <bytes>                   # starting position in input file (Default 0)

-s <bytes>                  # maximum size of input data to be copied
                                  # 使用後, "pct rescued:" 是 "-s" 的份量

-b softbs                      # sector size of input device [Default 512]

-f, --force                     # 當目標是 Device 時就要加它

-A, --try-again             # mark non-split, non-trimmed blocks as non-tried
                                  # Try this if the drive stops responding and ddrescue immediately starts scraping failed blocks when restarted.

-M, --retrim                 # Mark all failed blocks inside the rescue domain as non-trimmed before beginning the rescue.

-p, --preallocate          # preallocate space on disc for output file
                                 # If preallocation succeeds, rescue will not fail due to lack of free space on disc.

-c, --cluster-size=       # Number of sectors to copy at a time. Defaults to "64 KiB / sector_size"

e.g.

-c 32 -b 4096    # 每 128 KiB 抄一次 (32 x 4k)

 


Output Info

 

e.g.

Current status
     ipos:  852214 MB, non-trimmed:    3653 kB,  current rate:    3276 B/s
     opos:  852214 MB, non-scraped:        0 B,  average rate:  43441 kB/s
non-tried:    1310 GB,  bad-sector:     8192 B,    error rate:       0 B/s
  rescued:  636641 MB,   bad areas:        2,        run time:  2h 58m 42s
pct rescued:   32.70%, read errors:       70,  remaining time:      5h 26m
 slow reads:      471,        time since last successful read:          0s
Copying non-tried blocks... Pass 1 (forwards)

ipos

    Input position. where data are being currently read from.

opos

    Output position. where data are being currently written to.

non-tried

    Size of the part of the rescue domain pending to be tried. This is the sum of the sizes of all the non-tried blocks.

    This is the sum of the sizes of all the non-trimmed, non-scraped, and bad-sector blocks.

bad-sector

    It increases during the trimming and scraping phases, and may decrease during the retrying phase.

    as ddrescue retries the bad-sector blocks, the good data found may divide them into smaller blocks,
    decreasing the total error size but increasing the number of bad areas.

bad areas

    Number of separate bad-sector blocks inside the rescue domain.

 


Basic Example

 

1) 將 sda3 變成 image

dd_rescue /dev/sda3 /media/backup/sda3.img ddrescue.log

2) Disk to Disk Clone

# sda -> sdz

# '-f'         Force overwrite of outfile.
               Needed when outfile is not a regular file, but a device or partition.

ddrescue -f -n -N /dev/sda /dev/sdz /root/ddrescue.log

# 用 zero 去清有問題的 Block

ddrescue -f /dev/zero /dev/sdz /root/ddrescue.log

2) zip 起它 / transfer to remote

ddrescue /dev/sda1 - | bzip2 > /dir/file.img.bz2

3) transfer to remote

ddrescue /dev/sda1 - |
  ssh user@remote "cat - > /remote/destination/file.img"

 


Rescue some key disc areas

 

Opts

'-i bytes' / '--input-position=bytes'

Starting position of the rescue domain in infile, in bytes. Defaults to 0.  

'-s bytes' / '--size=bytes'

Maximum size of the rescue domain, in bytes. It limits the amount of input data to be copied.

e.g.

應用: 分多 Part recovery

# 由 0 byte 位置開始 copy 200GiB

ddrescue -i -s200GiB /dev/sde sde.img ddrescue.log

# 由 30Gib 位置開始 copy 10GiB

ddrescue -i30GiB -s10GiB /dev/sde sde.img ddrescue.log

Notes

Unit 不支援小數點, 所以 1.7TiB 要寫成 1700GiB

 


Retry & Skip & Reopen

 

ddrescue -A --retrim -f -i0 -s300GiB  /dev/sdc /dev/sdd /home/logfile

'-A'   ('--try-again')

Mark all non-trimmed and non-scraped blocks inside the rescue domain as non-tried before beginning the rescue.

Try this if the drive stops responding and ddrescue immediately starts scraping failed blocks when restarted.

If '--retrim' is also specified, mark all failed blocks inside the rescue domain as non-tried.

'--retrim'

Mark all failed blocks inside the rescue domain as non-trimmed before beginning the rescue.

Skip

-n, --no-scrape                               

# skip the scraping phase
# Avoids spending a lot of time trying to rescue the most difficult parts of the file.

-N, --no-trim

# skip the trimming phase

Example: Recovery to image file

cd /home/recovery

# '-N'           skip the trimming phase
# '-n'            skip the scraping phase
# -b 4096     4k block

ddrescue -n -N -b 4096 /dev/sdf sdf.img ddrescue.log

# '-d'            use direct disc access for input file

ddrescue -d -r 3 -b 4096 /dev/sdf sdf.img ddrescue.log

Notes

# 查看 status

watch 'dmesg | tail -20'

retry-passes

'-r n' /  '--retry-passes=n'       # Exit after given number of retry passes. Defaults to 0 (-1=infinity)

使用後由

[696330.055065] sd 7:0:0:0: [sde] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[696330.055069] sd 7:0:0:0: [sde] tag#0 Sense Key : Medium Error [current]
[696330.055072] sd 7:0:0:0: [sde] tag#0 Add. Sense: Unrecovered read error
[696330.055076] sd 7:0:0:0: [sde] tag#0 CDB: Read(10) 28 00 20 88 22 30 00 00 08 00
[696330.055078] print_req_error: critical medium error, dev sde, sector 545792560
[696330.055089] Buffer I/O error on dev sde, logical block 68224070, async page read
[696336.077706] sd 7:0:0:0: [sde] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[696336.077711] sd 7:0:0:0: [sde] tag#0 Sense Key : Medium Error [current]
[696336.077714] sd 7:0:0:0: [sde] tag#0 Add. Sense: Unrecovered read error
[696336.077717] sd 7:0:0:0: [sde] tag#0 CDB: Read(10) 28 00 20 88 22 30 00 00 08 00
[696336.077720] print_req_error: critical medium error, dev sde, sector 545792560
[696336.077731] Buffer I/O error on dev sde, logical block 68224070, async page read

變成 (沒有了 "Buffer I/O error" 及重複一次 )

[696419.260665] sd 7:0:0:0: [sde] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[696419.260669] sd 7:0:0:0: [sde] tag#0 Sense Key : Medium Error [current]
[696419.260672] sd 7:0:0:0: [sde] tag#0 Add. Sense: Unrecovered read error
[696419.260675] sd 7:0:0:0: [sde] tag#0 CDB: Read(10) 28 00 20 8a 97 00 00 00 80 00
[696419.260678] print_req_error: critical medium error, dev sde, sector 545953536

Reopen

-O, --reopen-on-error          reopen input file after every read error

 


Direct disc access (-d)

 

If you notice that the positions and sizes in the logfile are ALWAYS multiples of the sector size,

maybe your kernel is caching the disc accesses and grouping them.

In this case you may want to use direct disc access to bypass the kernel cache and rescue more of your data.

NOTE! Sector size must be correctly set with the '--sector-size' option for this to work.

# fast reading first
ddrescue -f -n /dev/hdb1 /dev/hdc1 logfile

# slower than normal cached reading
ddrescue -f -d -r3 /dev/hdb1 /dev/hdc1 logfile

# fix & mount
e2fsck -v -f /dev/hdc1
mount -t ext2 -o ro /dev/hdc1 /mnt

 


Change Passes

 

--cpass=range

  • 1
  • 1,2,3
  • 2-4

 


Logfile Structure

 

Example:

# Mapfile. Created by GNU ddrescue version 1.23
# Command line: ddrescue -A -n -d /dev/sdf3 dsk2.img dsk2.log
# Start time:   2023-11-03 17:09:35
# Current time: 2023-11-03 17:14:41
# Copying non-tried blocks... Pass 1 (forwards)
# current_pos  current_status  current_pass
0x368A14BA00     ?               1
#      pos        size  status
0x00000000  0x00200000  +
0x00200000  0x01CA0000  -

The first non-comment line is the status line.
(The status line allows ddrescue to resume the copying phase instead of restarting it from pass 1)

  • The first integer is the position being tried in the input file.
  • Status character
  • Current pass in the current phase.

Status character

Character     Meaning

  • '?'     copying non-tried blocks
  • '*'     trimming non-trimmed blocks
  • '/'     scraping non-scraped blocks
  • '-'     retrying bad sectors
  • 'F'     filling specified blocks
  • 'G'     generating approximate logfile
  • '+'     finished

The blocks in the list of data blocks

Character     Meaning

  • '?'     # non-tried block
  • '*'     # failed block non-trimmed
  • '/'     # failed block non-scraped
  • '-'     # failed block bad-sectors (bad blocks)
  • '+'    # finished blocks (good sectors)

 


Example: Recovery CD

 

# recovery

ddrescue -n -b2048 /dev/sr0 cd.iso log.txt

ddrescue -d -r 3 -b2048 /dev/sr0 cd.iso log.txt

remark

-b # sector size of input device [default 512]

在 CD 時一定係 -b2048

如果 errsize 係 0 咁果隻 CD 就救返了.

# Identifying Discs (iso image)

apt-get install genisoimage

isoinfo -d -i /dev/sr0

  • -d                     # Print information from the primary volume descriptor (PVD)  of  the  iso9660  image.
  • -i iso_image

# recovery content from CD

mount -o ro,loop -t iso9660 image.iso /mnt/mountpoint

 


Fill Mode (--fill)

 

if you use the "--fill" option, ddrescue does not rescue anything.

fill blocks of given types with data (?*/-+l)

Note that in fill mode the input file is always read from position 0.

In fill mode the input file may have any size.

If it is too small, the data will be duplicated as many times as necessary to fill the input buffer.

If it is too big, only the needed data will be read.

i.e.

# fills all areas marked as ‘-’ (bad hardware blocks) with copies of the string "BAD BLOCK"

echo -n "BaDB1K~!" > tmpfile

ddrescue --fill='-' tmpfile sde.img ddrescue.log
ddrescue --fill='*' tmpfile sde.img ddrescue.log
ddrescue --fill='/' tmpfile sde.img ddrescue.log
ddrescue --fill='?' tmpfile sde.img ddrescue.log

Figure out currupt file

1) Copy the damaged drive with ddrescue until finished.

    Do not use sparse writes. This yields a logfile with only finished (‘+’) and bad (‘-’) areas.

    -S, --sparse               # use sparse writes for output file

2) Mount the copied drive (or the image file, via loopback device).

3) Compute a md5sum or other checksum for every file.

    Build a list of all the files and their checksums.

4) Fill the bad areas of the copied drive or image file with a byte value different from zero.

5) Verify the checksums. Those files which have different checksums this time reside (at least partially) in damaged disk areas.

 


ddrescueview

 

HomePage: http://sourceforge.net/p/ddrescueview

它是一個將 mapfile 轉成 Graph 的顯示工具

 

 


Display Settings

 

-B, --binary-prefixes    # Default: SI prefixes (powers of 1000)
                                 # Show units with binary prefixes (powers of 1024)

 

 

 

Creative Commons license icon Creative Commons license icon