Patrol Read & Consistency Checking

Last updated: 2015-12-14

 

Introduction

LSI RAID cards provide two kinds of background consistency checking:

- consistency checks

- patrol reads

 * MegaRAID cards are designed to run both a patrol read and a consistency check at least every 168 hours (7 days).

 * A consistency check is not the same thing as a patrol read.

 


View adapter properties

 

MegaCli64 -AdpAllInfo -aALL | grep -i -e patrol -e consistency

Check Consistency Rate           : 30%
Patrol Read Rate                : Yes
Enable SSD Patrol Read                  : No

P.S.

Older firmware does not have "consistency checks".
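
For scripting, a single value can be pulled out of the listing above. This is only a sketch: adp_prop is a made-up helper name (not part of MegaCli), and it assumes the "Label : value" layout shown in the sample output.

```shell
#!/bin/sh
# adp_prop LABEL: read "MegaCli64 -AdpAllInfo" text on stdin and print the
# value of the first line whose label matches LABEL exactly.
# Hypothetical helper; assumes the "Label   : value" layout shown above.
adp_prop() {
    awk -F':' -v key="$1" '
        {
            label = $1
            gsub(/^[ \t]+|[ \t]+$/, "", label)       # trim the label
            if (label == key) {
                value = substr($0, index($0, ":") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim the value
                print value
                exit
            }
        }'
}
```

Example: MegaCli64 -AdpAllInfo -aALL | adp_prop "Check Consistency Rate"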

 


Patrol read

 

Purpose

Tries to discover disk errors before it is too late and data is lost.

(This includes hot spares connected to the controller.)

How it works

Patrol read makes the drives read their data by issuing "read-verify" commands.

With "read-verify", data is not transferred from the drives to the MegaRAID adapter unless one or more drives in the stripe detect and report an error.

If a single drive reports an error within the stripe, patrol read issues read commands to all the other drives in the stripe, and the MegaRAID adapter recreates the failing stripe unit from the remaining data and parity stripe units.

The adapter then issues a write-verify command to the drive that reported the error, writing the recreated portion of the stripe back to that drive.

Once this write completes successfully, the stripe is known to be good, and patrol read continues with the next stripe.

If two or more drives report errors during the read-verify phase, the failing stripe is added to the Bad Stripe Table.

 * delay of 168 hours between patrol reads

 * uses up to 30% of I/O resources

 * includes hot spares connected to the controller

 * starts only when the controller has been idle for a defined period of time and no other background tasks are active
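
The per-stripe handling described above can be illustrated with a toy model. It is a sketch under assumptions, not the controller's actual code: it assumes single XOR parity (RAID 5 style), which these notes do not state explicitly, and the function names and the three small integers standing in for stripe units are made up.

```shell
#!/bin/sh
# Toy model of one stripe: three data units plus one XOR parity unit.
# Real controllers work on blocks of bytes; small integers stand in here.

# stripe_action N: what patrol read does when N drives report an error.
stripe_action() {
    case "$1" in
        0) echo "ok" ;;                # nothing to do, move to the next stripe
        1) echo "rebuild" ;;           # recreate the unit, then write-verify it
        *) echo "bad-stripe-table" ;;  # two or more errors: cannot be recreated
    esac
}

# xor_units A B ...: XOR all arguments together.
xor_units() {
    acc=0
    for unit in "$@"; do
        acc=$(( acc ^ unit ))
    done
    echo "$acc"
}

d0=165 d1=60 d2=255
parity=$(xor_units "$d0" "$d1" "$d2")

# Suppose the drive holding d1 reports a read-verify error.
# XOR of the surviving data units and the parity gives the lost unit back.
rebuilt=$(xor_units "$d0" "$d2" "$parity")
echo "action=$(stripe_action 1) rebuilt=$rebuilt"
```

The rebuilt value equals the original d1 (60), which is what then gets written back to the drive that reported the error.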

(1) Patrol read setting

MegaCli64 -AdpPR -Info -aAll

Adapter 0: Patrol Read Information:

Patrol Read Mode: Auto
Patrol Read Execution Delay: 168 hours         <-- 7 days
Number of iterations completed: 92
Current State: Stopped
Patrol Read on SSD Devices: Disabled

Remark

By default it runs automatically (with a delay of 168 hours between patrol reads)
and uses up to 30% of I/O resources.
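
To use the delay in a script, the hours can be converted to days. A minimal sketch: pr_delay_days is a made-up helper name, and it assumes the "Patrol Read Execution Delay: N hours" line looks like the sample output above.

```shell
#!/bin/sh
# pr_delay_days: read "MegaCli64 -AdpPR -Info" text on stdin and print the
# execution delay converted from hours to whole days.
# Hypothetical helper; assumes the line format shown in the sample output.
pr_delay_days() {
    awk -F': *' '/^Patrol Read Execution Delay/ {
        split($2, parts, " ")          # parts[1] = number, parts[2] = unit
        printf "%d\n", parts[1] / 24   # fractional days are truncated
        exit
    }'
}
```

Example: MegaCli64 -AdpPR -Info -aALL | pr_delay_days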

(2) Patrol Read Rate

MegaCli64 -AdpGetProp PatrolReadRate -aALL

Adapter 0: Patrol Read Rate = 30%

P.S.

# Set the rate to 10%

MegaCli64 -AdpSetProp PatrolReadRate 10 -aALL

(3) Enable & Disable automatic patrol read

# To enable automatic patrol read:

MegaCli64 -AdpPR -EnblAuto -aALL

# To disable automatic patrol read:

MegaCli64 -AdpPR -Dsbl -aALL

# To enable manual patrol read for the selected controllers:

# (patrol read then does not start automatically)

MegaCli64 -AdpPR -EnblMan -aALL

(4) Manual patrol read scan

# Start:

MegaCli64 -AdpPR -Start -aALL

# Stop:

MegaCli64 -AdpPR -Stop -aALL

# Status while a patrol read is running:

MegaCli64 -AdpPR -Info -aALL

Patrol Read Mode: Auto
Patrol Read Execution Delay: 168 hours
Number of iterations completed: 92
Current State: Active
Adapter 0: Number of PDs completed: 0
Patrol Read on SSD Devices: Disabled
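
A script that has to wait for a manual scan to finish can poll this status. The following is a sketch only: wait_pr_done and PR_POLL_SECONDS are made-up names, and it assumes the "Current State: Active" line appears exactly as in the output above.

```shell
#!/bin/sh
# wait_pr_done CMD [ARGS...]: run CMD repeatedly until its output no longer
# contains a "Current State: Active" line.
# Hypothetical helper; PR_POLL_SECONDS (default 60) is the pause between polls.
wait_pr_done() {
    interval=${PR_POLL_SECONDS:-60}
    while "$@" | grep -q '^Current State: Active'; do
        sleep "$interval"
    done
}
```

Example: wait_pr_done MegaCli64 -AdpPR -Info -aALL

Passing the status command as arguments keeps the loop independent of where MegaCli64 is installed.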

# Suspend | Resume

MegaCli64 -AdpPR -Suspend -aALL

MegaCli64 -AdpPR -Resume -aALL

(5) Correct media errors during patrol read

# Get setting

MegaCli64 -AdpGetProp PrCorrectUncfgdAreas -aALL

Adapter 0: PR Correct Unconfigured Areas: Enabled

# Modify setting (1 = enable, 0 = disable)

MegaCli64 -AdpSetProp -PrCorrectUncfgdAreas -1 -aALL

 


Consistency check

 

In a system with parity, checking consistency means recomputing the parity from the data drives and comparing the result with the stored parity.

 * Not applicable to RAID 0 (it has no redundancy to check against)

# When the next consistency check is scheduled

MegaCli64 -AdpCcSched -Info -aALL

    Adapter #0

    Operation Mode: Disabled
    Execution Delay: 168
    Next start time: 12/14/2013, 03:00:00
    Current State: Stopped
    Number of iterations: 0
    Number of VD completed: 0
    Excluded VDs          : None
    Exit Code: 0x00
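
The schedule listing can be reduced to one line for monitoring. A sketch under assumptions: cc_sched_summary is a made-up name, and the field labels are taken from the sample output above.

```shell
#!/bin/sh
# cc_sched_summary: read "MegaCli64 -AdpCcSched -Info" text on stdin and
# print a one-line summary. Hypothetical helper; labels as in the sample.
cc_sched_summary() {
    awk '
        function value(line) {
            line = substr(line, index(line, ":") + 1)  # text after first colon
            gsub(/^[ \t]+|[ \t]+$/, "", line)
            return line
        }
        /^[ \t]*Operation Mode:/  { mode = value($0) }
        /^[ \t]*Next start time:/ { start = value($0) }
        END {
            if (mode == "Disabled")
                print "scheduled CC disabled"
            else
                print "mode=" mode " next=" start
        }'
}
```

Only the text after the first colon is taken as the value, so the colons inside the start time (03:00:00) are preserved.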

# Consistency Check Rate(CCRate)

# Get

MegaCli64 -AdpGetProp CCRate -aALL

# Set

MegaCli64 -AdpSetProp CCRate 10 -aALL
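
Both rate properties used in these notes (PatrolReadRate and CCRate) take a percentage, so a wrapper can reject bad values before anything reaches the controller. A sketch only: set_rate_cmd is a made-up name, the 0-100 range is an assumption based on the values being percentages, and the function prints the command instead of running it.

```shell
#!/bin/sh
# set_rate_cmd PROP PERCENT: validate the arguments and print (not run) the
# matching MegaCli64 command. Hypothetical helper; the 0-100 range is assumed.
set_rate_cmd() {
    prop=$1
    pct=$2
    case "$prop" in
        PatrolReadRate|CCRate) ;;
        *) echo "unknown property: $prop" >&2; return 1 ;;
    esac
    case "$pct" in
        ''|*[!0-9]*) echo "rate must be a whole number" >&2; return 1 ;;
    esac
    if [ "$pct" -gt 100 ]; then
        echo "rate must be between 0 and 100" >&2
        return 1
    fi
    echo "MegaCli64 -AdpSetProp $prop $pct -aALL"
}
```

Example: set_rate_cmd CCRate 10 prints the same command as the "Set" line above.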

# Set when the scheduled task runs (date as yyyymmdd, hour as hh)

MegaCli64 -AdpCcSched -SetStartTime yyyymmdd hh -aALL
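
The two arguments for the start time are easy to get wrong, so they can be built and checked first. A sketch only: cc_start_args is a made-up name, and it assumes the date is given as eight digits (yyyymmdd) and the hour as two digits, which is how MegaCli's help text describes -SetStartTime as far as I know.

```shell
#!/bin/sh
# cc_start_args DATE HOUR: print "yyyymmdd hh" for use with -SetStartTime.
# Hypothetical helper; the eight-digit date format is an assumption.
cc_start_args() {
    case "$1" in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "date must be yyyymmdd" >&2; return 1 ;;
    esac
    case "$2" in
        ''|*[!0-9]*) echo "hour must be a number" >&2; return 1 ;;
    esac
    hour=${2#0}          # drop one leading zero so 08 is not read as octal
    hour=${hour:-0}
    if [ "$hour" -gt 23 ]; then
        echo "hour must be 0-23" >&2
        return 1
    fi
    printf '%s %02d\n' "$1" "$hour"
}
```

Example: cc_start_args 20151220 3 prints "20151220 03".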

# Mode: Disabled | Concurrent | Sequential

MegaCli64 -AdpCcSched -ModeConc -aALL                     <-- changes "Operation Mode:"

MegaCli64 -AdpCcSched -ModeSeq -aALL

MegaCli64 -AdpCcSched -Dsbl -aALL

Remark

ModeConc: The scheduled CC on all of the virtual drives runs concurrently for the given adapter(s).

ModeSeq: The scheduled CC on all of the virtual drives runs sequentially for the given adapter(s).

# Run manually

MegaCli64 -LDCC -Start|-Abort|-ShowProg|-ProgDsply -LALL -aALL

 

# Show

MegaCli64 -LDCC -ShowProg -LALL -aALL

-ShowProg: Displays a snapshot of an ongoing CC.

Check Consistency on VD #0 (target id #0) Completed 2% in 7 Minutes.

MegaCli64 -LDCC -ProgDsply -LALL -aALL

-ProgDsply: Displays ongoing CC progress. The progress displays until at least one CC is completed or a key is pressed.

 Progress of Virtual Drives...

  Virtual Drive #              Percent Complete                       Time Elps
          0         #                      02 %                        00:07:59

    Press <ESC> key to quit...
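
The -ShowProg snapshot line gives enough to estimate the remaining time. A rough sketch: cc_eta is a made-up name, it assumes the "Completed N% in M Minutes" wording shown above, and it assumes progress is linear, which a busy array will not strictly follow.

```shell
#!/bin/sh
# cc_eta: read "MegaCli64 -LDCC -ShowProg" text on stdin and print a linear
# estimate of the minutes remaining for each virtual drive.
# Hypothetical helper; assumes the line format shown in the sample output.
cc_eta() {
    sed -n 's/^Check Consistency on VD #\([0-9]*\).*Completed \([0-9]*\)% in \([0-9]*\) Minutes.*/\1 \2 \3/p' |
    while read -r vd pct mins; do
        if [ "$pct" -gt 0 ]; then
            total=$(( mins * 100 / pct ))   # projected total run time
            echo "VD $vd: about $(( total - mins )) minutes remaining"
        else
            echo "VD $vd: no estimate yet"
        fi
    done
}
```

For the sample line above (2% in 7 minutes) the projected total is 350 minutes, so about 343 minutes remain.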

 


Script: chk_raid_health.sh

 

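#!/bin/sh
# Mail a short per-drive status summary (slot, firmware state, inquiry data).

MyServer="Server"
Admin="x@x"
MegaCli="/opt/MegaRAID/MegaCli/MegaCli64"
Log="/tmp/raid_status.txt"

# Keep only the Slot, Firmware and Inquiry lines for every physical drive
"$MegaCli" -PDList -aALL | grep -e '^Slot' -e '^Firmware' -e '^Inquiry' > "$Log"

# -a attaches the log; this assumes a mailx variant where -a means "attach"
echo "" | mail -s "$MyServer RAID status" -a "$Log" "$Admin"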
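
The script above mails the full list every time. To flag only drives that need attention, the firmware state can be checked first. A sketch under assumptions: pd_problems is a made-up name, it expects the "Slot Number:" and "Firmware state:" lines that -PDList prints, and treating every state other than Online or Hotspare as a problem is my own simplification.

```shell
#!/bin/sh
# pd_problems: read "MegaCli64 -PDList" text on stdin and print one line per
# drive whose firmware state does not start with Online or Hotspare.
# Hypothetical helper; the "healthy" states are an assumption.
pd_problems() {
    awk '
        /^Slot Number:/    { slot = $3 }
        /^Firmware state:/ {
            state = substr($0, index($0, ":") + 2)   # text after ": "
            if (state !~ /^(Online|Hotspare)/)
                print "slot " slot ": " state
        }'
}
```

Example: MegaCli64 -PDList -aALL | pd_problems

An empty result means no drive was flagged, so a wrapper could send mail only when the output is non-empty.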