
How can I check whether a RAID 6 (not RAID 6i) has Silent Data Corruption (SDC)?


If you read the same data from a RAID 6 (locally or from a client), check the MD5 sums, and get a different result every time, there is probably silent data corruption on the disks of your RAID.
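The repeated MD5 check described above can be sketched as a small shell helper. This is only an illustration: the function name `count_distinct_md5` and the example path are hypothetical and not part of RAIDIX. Keep in mind that repeated reads on the same host may be served from the operating system's page cache rather than from the RAID.

```shell
#!/bin/sh
# Hypothetical helper: read a file several times and print how many
# distinct MD5 sums were seen. For a file that is not being modified,
# any result greater than 1 means the reads are not stable.
count_distinct_md5() {
    file=$1
    runs=${2:-5}          # number of reads, defaults to 5
    i=0
    while [ "$i" -lt "$runs" ]; do
        md5sum < "$file" | awk '{print $1}'
        i=$((i + 1))
    done | sort -u | wc -l | tr -d ' '
}

# Example (hypothetical path on a file system that lives on the RAID):
# count_distinct_md5 /mnt/lun0/bigfile.bin 5
```

A result of 1 means every read returned the same checksum; a larger number is the symptom this article is about.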

NOTE: Advanced Reconstruction must be enabled. If it is disabled, the problem is caused by something else.

1. SDC detection works only for RAID stripes that contain real data, so to test for SDC you must read a block that contains data. For example, you can read files from a file system from the clients.

2. Disable I/O errors when SDC is detected

echo 0 > /sys/devices/RAIDIX/RAIDIXdevice0/parameters/sdc_io_error
echo 0 > /sys/devices/RAIDIX/RAIDIXdevice1/parameters/sdc_io_error

 

3. Enable debug log level

 

echo 2048 > /sys/module/raidixlib/parameters/log_level

4. Enable SDC detection

 

echo 1 > /sys/devices/RAIDIX/RAIDIXdevice0/parameters/verify_synd

 

5. Drop the RAID cache

 

echo 'h' > /sys/devices/RAIDIX/RAIDIXdevice0/stat

 

6. Read a file that has MD5 problems, or read a big block from a LUN locally with dd. If the RAID stripes that contain this file or raw block have SDC, you will see a notification in the Web UI.

 

7. Check logs 

 

cat /var/log/messages | grep -i sdc | grep "DN"
Jul 25 18:31:02 genesis kernel: Raidix: In hr_stripe_report_sdc, line 2299: CPU 6 SN 3759,SSN 44,SSLDEC 2056040,SSLHEX 1f5f68,DN 2
Jul 25 18:31:02 genesis kernel: Raidix: In hr_stripe_report_sdc, line 2299: CPU 4 SN 3759,SSN 45,SSLDEC 2056048,SSLHEX 1f5f70,DN 2
Jul 25 18:31:02 genesis kernel: Raidix: In hr_stripe_report_sdc, line 2299: CPU 0 SN 3759,SSN 46,SSLDEC 2056056,SSLHEX 1f5f78,DN 2
 

8. DN 2 in this example means that the disk in position 2 has SDC.

 

9. Check the "real" disk name

 

cat /sys/devices/RAIDIX/RAIDIXdevice0/disk10

10. Replace this drive.
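
When the log contains many SDC reports, it can help to count them per disk position before deciding which drive to replace. The following is a minimal sketch that assumes the log lines have the same "DN <number>" field as the sample output in step 7; the function name `count_sdc_by_dn` is hypothetical and not part of RAIDIX.

```shell
#!/bin/sh
# Hypothetical helper: read log lines on stdin and print the number of
# SDC stripe reports for each disk position (the DN field).
# Assumes the "DN <number>" format shown in the sample log output.
count_sdc_by_dn() {
    grep -i 'sdc' |
        sed -n 's/.*DN \([0-9][0-9]*\).*/\1/p' |
        sort -n | uniq -c |
        awk '{print "DN " $2 ": " $1 " stripe report(s)"}'
}

# Example:
# count_sdc_by_dn < /var/log/messages
```

If almost all reports point to a single DN value, that disk position is the likely source of the corruption.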

 

 
