Cleanup & Check Harddisk Under LSI RAID
Revision of Wed, 03/02/2021 - 10:12


Last updated: 2020-02-03

 

 


SAS Disk Health Indicators

 

Relevant parameters

  • Total uncorrected errors
  • Elements in grown defect list

Expected: an empty grown defect list (a few drives may show up to 5 entries).

If the count is not zero, monitor the defect list for some time to see whether it is still growing.

A steadily growing defect list is a strong sign that the drive will fail in the near future.
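
The monitoring advice above can be sketched as two small shell helpers. This is only a sketch: the function names grown_defects and defect_trend are made up for illustration, and it assumes the smartctl output contains a line of the form "Elements in grown defect list: N".

```shell
# Hypothetical helper: read `smartctl -a` output on stdin and print
# the number after "Elements in grown defect list:".
grown_defects() {
    awk -F': *' '/^Elements in grown defect list/ { print $2; exit }'
}

# Hypothetical helper: compare a previously recorded count ($1)
# with the current count ($2) and print a one-word verdict.
defect_trend() {
    prev=$1
    cur=$2
    if [ "$cur" -gt "$prev" ]; then
        echo "growing (+$((cur - prev)))"
    elif [ "$cur" -eq 0 ]; then
        echo "clean"
    else
        echo "stable"
    fi
}

# Possible usage (not run here):
#   cur=$(smartctl -d megaraid,16 -a /dev/bus/0 | grown_defects)
#   defect_trend "$last_week" "$cur"
```

Recording the count at intervals and comparing it with the previous value is what separates a drive with a few old defects ("stable") from one that is still degrading ("growing").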

 


How to check

 

1. Quick overview

MegaCli64 -PDinfo -PhysDrv[252:2] -a0

...
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
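
The three counters above are easy to miss in the full -PDinfo output. A minimal sketch of a filter that flags non-zero values; pd_error_check is a hypothetical name, and the line format is assumed to match the excerpt shown above.

```shell
# Hypothetical helper: read `MegaCli64 -PDinfo ...` output on stdin.
# Prints "<counter name>=<value>" for every non-zero counter.
# Exit status: 0 if all counters seen are zero, 1 otherwise.
pd_error_check() {
    awk -F': *' '
        /^(Media Error Count|Other Error Count|Predictive Failure Count)/ {
            if ($2 + 0 != 0) { print $1 "=" $2; bad = 1 }
        }
        END { exit bad + 0 }
    '
}

# Possible usage (not run here):
#   MegaCli64 -PDinfo -PhysDrv[252:2] -a0 | pd_error_check || echo "check this disk"
```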

smartctl --scan

...
/dev/bus/0 -d megaraid,16 # /dev/bus/0 [megaraid_disk_16], SCSI device
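
The number after "megaraid," in the scan output is the ID to pass to -d in the following commands. A small sketch that pulls out every such ID, assuming scan lines look like the one above; megaraid_ids is a hypothetical name.

```shell
# Hypothetical helper: read `smartctl --scan` output on stdin and print
# "<device> <id>" for each line of the form "<device> -d megaraid,<id> ...".
megaraid_ids() {
    awk '$2 == "-d" && $3 ~ /^megaraid,[0-9]+$/ {
        split($3, parts, ",")
        print $1, parts[2]
    }'
}

# Possible usage (not run here): dump SMART data for every disk behind the controller.
#   smartctl --scan | megaraid_ids | while read -r dev id; do
#       smartctl -d "megaraid,$id" -a "$dev"
#   done
```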

smartctl -d megaraid,16 -a /dev/bus/0

smartctl -d megaraid,16 -t short /dev/bus/0

smartctl -d megaraid,16 -a /dev/bus/0

2. Detailed check

MegaCli64 -cfgldadd R0[252:2] WT NORA -a0

MegaCli64 -LDInfo -Lall -a0

smartctl -d megaraid,16 -t long /dev/bus/0

smartctl -d megaraid,16 -a /dev/bus/0

...
Self-test execution status:             89% of test remaining
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Self test in progress ...  64     NOW                 - [-   -    -]
# 2  Background short  Completed                  64   39758                 - [-   -    -]

Long (extended) Self-test duration: 5616 seconds [93.6 minutes]

LifeTime: equivalent to "number of hours powered up"
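
A rough time-to-completion can be derived from the two values in the output above: 89% of 5616 seconds is about 4998 seconds (roughly 83 minutes). A sketch of that arithmetic, assuming both lines appear in the same smartctl output and that progress is roughly linear; selftest_eta is a hypothetical name.

```shell
# Hypothetical helper: read `smartctl -a` output on stdin and print the
# estimated seconds left for a running long self-test, computed as
#   (percent remaining) x (long self-test duration) / 100
# Exit status 1 if either value is missing from the input.
selftest_eta() {
    awk '
        /of test remaining/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[0-9]+%$/) { pct = $i + 0; have_pct = 1 }
        }
        /^Long \(extended\) Self-test duration:/ { total = $5 + 0; have_total = 1 }
        END {
            if (!have_pct || !have_total) exit 1
            printf "%d\n", total * pct / 100
        }
    '
}

# Possible usage (not run here):
#   smartctl -d megaraid,16 -a /dev/bus/0 | selftest_eta
```

The drive reports progress in coarse steps, so treat the result as an estimate only.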

Vendor (Seagate Cache) information
  Blocks sent to initiator = 2668630296                # does not increase
  Blocks received from initiator = 1675807082          # does not increase
  Blocks read from cache and sent to initiator = 4272401990

3. Wipe data

dmesg                                                                    # identify the correct disk to wipe

dd if=/dev/zero of=/dev/sdX bs=32M oflag=direct     # "Blocks received from initiator" keeps increasing

smartctl -d megaraid,16 -a /dev/bus/0
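
Comparing "Blocks received from initiator" before and during the dd run is a way to confirm that the writes are landing on the intended physical disk. A minimal sketch, assuming the vendor section is formatted as in the excerpt above; blocks_received and blocks_delta are hypothetical names.

```shell
# Hypothetical helper: read `smartctl -a` output on stdin and print the
# "Blocks received from initiator" value.
blocks_received() {
    awk -F'= *' '/Blocks received from initiator/ { print $2; exit }'
}

# Hypothetical helper: difference between an earlier ($1) and a later ($2) reading.
blocks_delta() {
    echo "$(( $2 - $1 ))"
}

# Possible usage (not run here):
#   before=$(smartctl -d megaraid,16 -a /dev/bus/0 | blocks_received)
#   ... start dd in another terminal, wait a little ...
#   after=$(smartctl -d megaraid,16 -a /dev/bus/0 | blocks_received)
#   blocks_delta "$before" "$after"    # should be greater than 0 for the disk being wiped
```

If the delta stays at 0 while dd is running, the /dev/sdX being written is probably not the disk behind megaraid,16.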

MegaCli64 -CfgLdDel -L1 -a0

Adapter 0: Deleted Virtual Drive-1(target id-1)

MegaCli64 -PDPrpRmv -PhysDrv[252:2] -a0

Prepare for removal Success

MegaCli64 -PDinfo -PhysDrv[252:2] -a0

...
Firmware state: Unconfigured(good), Spun down
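
Before physically pulling the disk, the firmware state can be checked in a script instead of by eye. A sketch under the assumption that the state line reads exactly as shown above; pd_state and ready_to_pull are hypothetical names.

```shell
# Hypothetical helper: read `MegaCli64 -PDinfo ...` output on stdin and print
# the text after "Firmware state:", with trailing whitespace stripped.
pd_state() {
    awk -F': *' '/^Firmware state/ {
        sub(/[ \t\r]+$/, "", $2)
        print $2
        exit
    }'
}

# Hypothetical helper: succeed only if the state ($1) is the one reported
# after a successful "prepare for removal".
ready_to_pull() {
    case "$1" in
        "Unconfigured(good), Spun down") return 0 ;;
        *) return 1 ;;
    esac
}

# Possible usage (not run here):
#   state=$(MegaCli64 -PDinfo -PhysDrv[252:2] -a0 | pd_state)
#   ready_to_pull "$state" && echo "safe to remove" || echo "not ready: $state"
```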

 


Troubleshoot

 

smartctl -d megaraid,16 -a /dev/bus/0

...
SMART support is:     Unavailable - device lacks SMART capability.

=== START OF READ SMART DATA SECTION ===
Current Drive Temperature:     0 C
Drive Trip Temperature:        0 C

Error Counter logging not supported

Device does not support Self Test logging

Cause: Firmware state: Unconfigured(good), Spun down

 => To keep the HDD from spinning down, an R0 (RAID 0) virtual drive has to be created on it first
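
This symptom can be detected in a script so that the spun-down case is not mistaken for a dead disk. A minimal sketch that matches the "SMART support is:" line shown above; smart_unavailable is a hypothetical name.

```shell
# Hypothetical helper: read `smartctl -a` output on stdin.
# Succeeds (exit 0) if SMART is reported as unavailable.
smart_unavailable() {
    grep -q 'SMART support is:.*Unavailable'
}

# Possible usage (not run here):
#   if smartctl -d megaraid,16 -a /dev/bus/0 | smart_unavailable; then
#       MegaCli64 -PDinfo -PhysDrv[252:2] -a0 | grep 'Firmware state'
#   fi
```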