Data-Recovery
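Recovering data from RAID 1 when both drives have failed
I have a RAID 1 on my server, and apparently both hard drives failed at the same time.
The server support staff ran a quick check to confirm: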
HDDTEST-W1F21M6K ERROR Finished (Selftest, Device: sda);
HDDTEST-W1F22Y9M ERROR Finished (Values-Check, Device: sdb);
However, there still seems to be a partition table on sdb. Your server is currently booted into our rescue system. Please try to backup your data if possible and contact us again if you wish to proceed with a hard drive replacement.
I can boot the system from another drive and see the following structure:
cat /proc/mdstat
Personalities : [raid1]
md3 : active raid1 sdb4[1]
      1822442815 blocks super 1.2 [2/1] [_U]
md2 : active raid1 sdb3[1]
      1073740664 blocks super 1.2 [2/1] [U_]
md1 : active raid1 sdb2[1]
      524276 blocks super 1.2 [1/1] [U]
md0 : active raid1 sdb1[1]
      33553336 blocks super 1.2 [2/1] [_U]
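For reading the status lines above: `[2/1]` means the array expects two members and has one active, and each `_` in the bracketed map marks a missing member. A minimal sketch for listing only the degraded arrays, assuming the usual /proc/mdstat layout in which the status line follows its `mdN :` line; the function name `degraded_arrays` is made up for illustration.

```shell
#!/bin/sh
# List md arrays whose member map contains "_" (a missing or failed member).
# Reads the file given as $1, defaulting to /proc/mdstat.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                 # remember the current array
        /blocks/ && $NF ~ /_/ { print name, $NF }   # status line with a missing member
    ' "${1:-/proc/mdstat}"
}

# Usage: degraded_arrays              (reads /proc/mdstat)
#        degraded_arrays saved.txt    (reads a saved copy)
```

On the output above this would be expected to report md3, md2 and md0, and to skip md1, which was created with a single member.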
What I need is to recover some important data from the /dev/md2 partition. When I try to mount md2, I get the following:
mount /dev/md2 /mnt
mount: wrong fs type, bad option, bad superblock on /dev/md2,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail or so
Any ideas on how to solve this?
Update 1
More data:
mdadm -E /dev/sdb3
/dev/sdb3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 39c5b7f5:c3bed499:e383ce7f:0868fc3e
           Name : rescue:2 (local to host rescue)
  Creation Time : Wed Feb 6 07:23:32 2013
     Raid Level : raid1
   Raid Devices : 2
 Avail Dev Size : 2147481600 (1024.00 GiB 1099.51 GB)
     Array Size : 1073740664 (1024.00 GiB 1099.51 GB)
  Used Dev Size : 2147481328 (1024.00 GiB 1099.51 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 3d68ec1a:3b125641:fa4b1d34:c829f017
    Update Time : Wed Aug 6 13:21:28 2014
       Checksum : dad4eccc - correct
         Events : 18773099
    Device Role : Active device 0
    Array State : A. ('A' == active, '.' == missing)
Update 2
Available volumes:
ls /dev/sd
sda sdb sdb1 sdb2 sdb3 sdb4 sdb5
mdadm -E /dev/sda
mdadm: No md superblock detected on /dev/sda.
dmesg output after the mount /dev/md2 /mnt attempt:
[Wed Aug 6 16:11:12 2014] ata2.00: exception Emask 0x0 SAct 0x600fffff SErr 0x0 action 0x0
[Wed Aug 6 16:11:12 2014] ata2.00: irq_stat 0x40000008
[Wed Aug 6 16:11:12 2014] ata2.00: cmd 60/08:e8:70:3b:d4/00:00:43:00:00/40 tag 29 ncq 4096 in
[Wed Aug 6 16:11:12 2014] res 41/40:08:70:3b:d4/00:00:43:00:00/00 Emask 0x409 (media error) <F>
[Wed Aug 6 16:11:12 2014] ata2.00: configured for UDMA/133
[Wed Aug 6 16:11:12 2014] sd 1:0:0:0: [sdb] Unhandled sense code
[Wed Aug 6 16:11:12 2014] sd 1:0:0:0: [sdb]
[Wed Aug 6 16:11:12 2014] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Wed Aug 6 16:11:12 2014] sd 1:0:0:0: [sdb]
[Wed Aug 6 16:11:12 2014] Sense Key : Medium Error [current] [descriptor]
[Wed Aug 6 16:11:12 2014] Descriptor sense data with sense descriptors (in hex):
[Wed Aug 6 16:11:12 2014] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[Wed Aug 6 16:11:12 2014] 43 d4 3b 70
[Wed Aug 6 16:11:12 2014] sd 1:0:0:0: [sdb]
[Wed Aug 6 16:11:12 2014] Add. Sense: Unrecovered read error - auto reallocate failed
[Wed Aug 6 16:11:12 2014] sd 1:0:0:0: [sdb] CDB:
[Wed Aug 6 16:11:12 2014] Read(16): 88 00 00 00 00 00 43 d4 3b 70 00 00 00 08 00 00
[Wed Aug 6 16:11:12 2014] end_request: I/O error, dev sdb, sector 1137982320
[Wed Aug 6 16:11:12 2014] ata2: EH complete
[Wed Aug 6 16:11:15 2014] JBD2: Failed to read block at offset 1134
[Wed Aug 6 16:11:15 2014] JBD2: IO error -5 recovering block 1134 in log
[Wed Aug 6 16:11:16 2014] JBD2: recovery failed
[Wed Aug 6 16:11:16 2014] EXT4-fs (md2): error loading journal
Update 3
For sdb:
smartctl -d ata -A /dev/sdb
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.14.10] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   101   099   006    Pre-fail  Always       -       216425892
  3 Spin_Up_Time            0x0003   095   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       6
  5 Reallocated_Sector_Ct   0x0033   092   092   010    Pre-fail  Always       -       10928
  7 Seek_Error_Rate         0x000f   081   060   030    Pre-fail  Always       -       149168536
  9 Power_On_Hours          0x0032   085   085   000    Old_age   Always       -       13145
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       6
183 Runtime_Bad_Block       0x0032   099   099   000    Old_age   Always       -       1
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   064   064   000    Old_age   Always       -       36
188 Command_Timeout         0x0032   100   098   000    Old_age   Always       -       12885098499
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   067   052   045    Old_age   Always       -       33 (Min/Max 26/36)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       4
193 Load_Cycle_Count        0x0032   092   092   000    Old_age   Always       -       17084
194 Temperature_Celsius     0x0022   033   048   000    Old_age   Always       -       33 (0 22 0 0)
197 Current_Pending_Sector  0x0012   097   097   000    Old_age   Always       -       504
198 Offline_Uncorrectable   0x0010   097   097   000    Old_age   Offline      -       504
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       128896263532923
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       10152724077
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       40689314539
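The attributes in this table that relate most directly to bad sectors are 5 (Reallocated_Sector_Ct), 187 (Reported_Uncorrect), 197 (Current_Pending_Sector) and 198 (Offline_Uncorrectable). A small sketch for pulling out just their raw values, assuming the ten-column `smartctl -A` layout shown above; `bad_sector_summary` is a hypothetical helper name, not part of smartmontools.

```shell
#!/bin/sh
# Print the raw values of the SMART attributes most closely tied to bad sectors.
# Expects `smartctl -A` output on stdin: field 1 is the attribute ID,
# field 2 the attribute name, and field 10 the start of the raw value.
bad_sector_summary() {
    awk '$1 == 5 || $1 == 187 || $1 == 197 || $1 == 198 { print $2 "=" $10 }'
}

# Usage: smartctl -d ata -A /dev/sdb | bad_sector_summary
```

Any non-zero raw value for these attributes is worth taking seriously, and the counts for sdb above are far from zero.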
For sda:
smartctl -d ata -A /dev/sda
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.14.10] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

Error SMART Values Read failed: Input/output error
Smartctl: SMART Read Values failed.

=== START OF READ SMART DATA SECTION ===
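OK, it looks like /dev/sda is pretty much dead; you won't get data off it, at least not without special tricks. /dev/sdb, on the other hand, appears to have a lot of bad sectors (the SMART output shows a Reallocated_Sector_Ct of 10928 and a Current_Pending_Sector count of 504). That is probably a bad sign, but you should still be able to get your data off it. Depending on how important the data is and how much you trust your backups, you will want to image the disk first, or at least the sectors that can still be read. Tools for this include GNU ddrescue and several similar programs.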
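A rough sketch of the imaging step with GNU ddrescue, run in two passes so the readable data is saved before the drive is stressed with retries. The function name `image_failing_disk` and the destination paths in the usage line are placeholders; the image and map file need to live on a separate, healthy disk with enough free space.

```shell
#!/bin/sh
# Image a failing disk or partition with GNU ddrescue in two passes.
# Arguments (all chosen by the caller): $1 = source device,
# $2 = image file, $3 = map file recording which areas were read.
image_failing_disk() {
    src=$1 img=$2 map=$3
    # Pass 1: copy everything that reads cleanly, skipping the slow scraping phase.
    ddrescue -d -n "$src" "$img" "$map" || return
    # Pass 2: go back to the bad areas recorded in the map file, retrying up to 3 times.
    ddrescue -d -r3 "$src" "$img" "$map"
}

# Usage (destination paths are hypothetical):
#   image_failing_disk /dev/sdb /mnt/backup/sdb.img /mnt/backup/sdb.map
```

Because progress is kept in the map file, an interrupted run can be restarted with the same arguments and it continues where it stopped.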
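Then run fsck, for example fsck /dev/md2 to work on the live system. You can try -p first, which automatically fixes only the errors it is fairly sure are low-risk, or -y, which tells it to fix everything (even when that is risky). With no options, it prompts you before each repair. After that, you should be able to mount /dev/md2 and get your data, or at least what is left of it. I would ask your hosting company to hold on to the two failed disks for a while after replacing them, until you are sure you have all your data.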
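One way to sketch the "try -p first" approach is a small wrapper that runs the safe pass and only suggests the riskier -y run when errors remain. It relies on the documented fsck exit status, which is a bit mask where 4 means filesystem errors were left uncorrected; the name `staged_fsck` is invented for this example.

```shell
#!/bin/sh
# Check a filesystem with the low-risk option first.
# $1 is the device to check; it must not be mounted while fsck runs.
staged_fsck() {
    dev=$1
    rc=0
    # -p: automatically fix only problems that can be repaired safely.
    fsck -p "$dev" || rc=$?
    # The exit status is a bit mask; bit 4 set means errors were left uncorrected.
    if [ $((rc & 4)) -ne 0 ]; then
        echo "fsck -p left errors on $dev; review them, then consider: fsck -y $dev" >&2
    fi
    return "$rc"
}

# Usage: staged_fsck /dev/md2
```

The wrapper deliberately does not run -y on its own, since that pass accepts every proposed repair and is best started by hand once the disk has been imaged.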