IBM - Bad Stripes event

IBM RAID Controller Bad Stripes event

An entry in the Bad Stripe Table (BST) indicates that the data contained in a stripe has been lost.

A single stripe unit failure is correctable and recoverable, but two or more failures within the same redundant RAID stripe are not.

After an entry is logged in the BST, the controller will return an error code to the driver whenever the host system tries to access a Logical Block Address (LBA) within the affected stripe. This is one immediate indication that some part of the logical drive is unusable.
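The lookup described above can be sketched as follows. This is a minimal illustration, not ServeRAID firmware or driver code: the stripe size of 128 blocks, the `BadStripeError` exception, and the function names are all hypothetical, chosen only to show how an LBA maps to a stripe and how a table hit becomes an error.

```python
# Hypothetical model: none of these names come from ServeRAID itself.
STRIPE_SIZE_BLOCKS = 128  # assumed number of LBAs per stripe, for illustration only


class BadStripeError(IOError):
    """Raised when a request touches a stripe that is listed in the BST."""


def stripe_of(lba, stripe_size_blocks=STRIPE_SIZE_BLOCKS):
    """Map a Logical Block Address to the index of the stripe that holds it."""
    return lba // stripe_size_blocks


def check_access(lba, bad_stripes, stripe_size_blocks=STRIPE_SIZE_BLOCKS):
    """Return the stripe index for lba, or raise if that stripe is in the BST."""
    stripe = stripe_of(lba, stripe_size_blocks)
    if stripe in bad_stripes:
        raise BadStripeError(f"LBA {lba} is in bad stripe {stripe}")
    return stripe
```

With a table containing only stripe 3, any LBA from 384 through 511 would fail under these assumptions, while LBAs outside that range are still served.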

The Bad Stripe Table (BST) tracks stripes across a logical drive that contain invalid or incomplete data. There is a separate table for each logical drive.

What are Bad Stripe Table (BST) limitations?

ServeRAID firmware allows a maximum of 128 entries in the BST for a logical drive before blocking that logical drive. If 128 entries already exist in the BST for a logical drive in the Rebuild state and another uncorrectable read error occurs, such that the firmware would normally add the stripe to the BST, the rebuild is halted and the logical drive becomes blocked. The drive that had been rebuilding continues to be shown as rebuilding, but no rebuild activity occurs.
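The limit can be modeled with a small sketch. Only the figure of 128 entries comes from the text above. The `LogicalDrive` class, the `rebuild` function, and the representation of the table as a set of stripe indexes are assumptions made for illustration and do not reflect the firmware's actual data structures.

```python
MAX_BST_ENTRIES = 128  # limit stated in the text above


class LogicalDrive:
    """Hypothetical model of one logical drive and its Bad Stripe Table."""

    def __init__(self):
        self.bst = set()      # stripe indexes currently logged as bad
        self.blocked = False  # set once the table cannot take another entry

    def record_bad_stripe(self, stripe):
        """Log a bad stripe. Return False if the table is full and the drive blocks."""
        if stripe in self.bst:
            return True
        if len(self.bst) >= MAX_BST_ENTRIES:
            self.blocked = True
            return False
        self.bst.add(stripe)
        return True


def rebuild(drive, stripes, unreadable):
    """Walk stripes in order and return how many were processed before halting."""
    done = 0
    for stripe in stripes:
        if stripe in unreadable and not drive.record_bad_stripe(stripe):
            break  # 128 entries already exist: rebuild halts, drive is blocked
        done += 1
    return done
```

In this model a rebuild that meets 128 unreadable stripes still completes, and the 129th unreadable stripe is the one that halts it and blocks the drive.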

The ServeRAID controller is designed to handle and correct a single stripe unit failure on a read or a write-verify. If two or more stripe units within the same horizontal stripe across the array fail at the same time for any reason, all stripe units within that stripe become blocked, and a bad stripe table entry is created in the hosted logical drive's configuration. The error message "Multiple stripe unit failures within a single horizontal stripe" is a clear definition of a bad stripe at its most basic level.
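Why one failure is recoverable and two are not can be shown with single-parity XOR arithmetic. The sketch below assumes a RAID level-5 style stripe in which the parity unit is the XOR of the data units. The function names are hypothetical, and the code illustrates the principle rather than the controller's implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_stripe(units):
    """Repair a stripe given as a list of units (data plus parity).

    None marks a failed unit. One failure is rebuilt from the survivors;
    two or more cannot be rebuilt from single parity and raise ValueError.
    """
    missing = [i for i, unit in enumerate(units) if unit is None]
    if len(missing) > 1:
        raise ValueError(
            "Multiple stripe unit failures within a single horizontal stripe"
        )
    repaired = list(units)
    if missing:
        survivors = [unit for unit in units if unit is not None]
        repaired[missing[0]] = xor_blocks(survivors)
    return repaired
```

Because the XOR of all units in a healthy stripe is zero, any one missing unit equals the XOR of the rest. With two units missing there is a single equation and two unknowns, which is the situation the bad stripe entry records.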

=========================================

Understanding logical-drive synchronization

The purpose of synchronizing logical drives is to compute and write the parity data on the selected drives. Synchronizing a logical drive verifies that the data redundancy for the logical drive is correct.
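As a rough illustration of what computing and writing the parity data means, the sketch below recomputes XOR parity for each stripe and rewrites it where it differs. The in-memory layout (a list of dictionaries with `data` and `parity` keys) and the function names are assumptions for this example only. A real controller works on disk blocks, not Python objects.

```python
def parity(data_units):
    """XOR all data units of one stripe into a single parity unit."""
    out = bytearray(len(data_units[0]))
    for unit in data_units:
        for i, byte in enumerate(unit):
            out[i] ^= byte
    return bytes(out)


def synchronize(stripes):
    """Recompute parity for every stripe and rewrite it where it is wrong.

    stripes is a list of dicts with 'data' (list of bytes) and 'parity' (bytes).
    Returns the indexes of the stripes whose stored parity had to be corrected.
    """
    corrected = []
    for index, stripe in enumerate(stripes):
        expected = parity(stripe["data"])
        if stripe["parity"] != expected:
            stripe["parity"] = expected
            corrected.append(index)
    return corrected
```

Running `synchronize` a second time on the same stripes returns an empty list, which corresponds to the redundancy being verified as correct.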

For the ServeRAID-8i controller and ServeRAID-7t controller, the ServeRAID Manager also supports auto-synchronization for RAID level-1 and 10 logical drives.

------------

Data scrubbing

Data scrubbing is an automatic background synchronization process. It keeps data "fresh" by doing the following:

    (For RAID level-5, 5E, 5EE, or 50) Reading data and rewriting the data parity.
    (For RAID level-1, 1E, 10, 1E0) Reading data and rewriting the mirror data.
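The two cases above can be sketched as a dispatch on the RAID level followed by the matching rewrite. The level groupings are taken from the list above. The function names and the simplified one-to-one mirror layout are assumptions for illustration, and level-1E in particular stripes its mirror copies differently in practice.

```python
PARITY_LEVELS = {"5", "5E", "5EE", "50"}
MIRROR_LEVELS = {"1", "1E", "10", "1E0"}


def scrub_action(raid_level):
    """Return which redundancy data a scrub pass rewrites for a RAID level."""
    if raid_level in PARITY_LEVELS:
        return "rewrite parity"
    if raid_level in MIRROR_LEVELS:
        return "rewrite mirror"
    raise ValueError(f"RAID level {raid_level} has no redundancy to scrub")


def scrub_mirror(primary, mirror):
    """Read each primary block and rewrite its mirror copy (simplified layout).

    primary and mirror are equal-length lists of bytes objects.
    Returns how many mirror blocks differed before being rewritten.
    """
    repaired = 0
    for i, block in enumerate(primary):
        if mirror[i] != block:
            repaired += 1
        mirror[i] = block
    return repaired
```

The parity case follows the same pattern, with the rewrite step replaced by recomputing the stripe's parity from the data that was just read.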