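Recovering data from a software RAID 5
A few days ago I noticed that the partition on my RAID 5 array was no longer mounted. I examined the disks and got:
mdadm --examine /dev/sd{a,b,c,d}1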
mdadm: No md superblock detected on /dev/sda1.
/dev/sdb1:
    Magic : a92b4efc
    Version : 1.2
    Feature Map : 0x0
    Array UUID : 87fdc598:a995d0f7:41123bcf:e2760aeb
    Name : itake:0 (local to host itake)
    Creation Time : Tue Aug 28 17:44:52 2012
    Raid Level : raid5
    Raid Devices : 4
    Avail Dev Size : 1953122952 (931.32 GiB 1000.00 GB)
    Array Size : 2929683456 (2793.96 GiB 3000.00 GB)
    Used Dev Size : 1953122304 (931.32 GiB 1000.00 GB)
    Data Offset : 2048 sectors
    Super Offset : 8 sectors
    Unused Space : before=1968 sectors, after=648 sectors
    State : clean
    Device UUID : db15e0ad:ef9f28be:de5e5a5a:f929ebb9
    Update Time : Sun Sep 11 00:00:26 2016
    Checksum : 700e7a14 - correct
    Events : 6141
    Layout : left-symmetric
    Chunk Size : 512K
    Device Role : Active device 1
    Array State : AA.A ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc1:
    Magic : a92b4efc
    Version : 1.2
    Feature Map : 0x0
    Array UUID : 87fdc598:a995d0f7:41123bcf:e2760aeb
    Name : itake:0 (local to host itake)
    Creation Time : Tue Aug 28 17:44:52 2012
    Raid Level : raid5
    Raid Devices : 4
    Avail Dev Size : 1953122952 (931.32 GiB 1000.00 GB)
    Array Size : 2929683456 (2793.96 GiB 3000.00 GB)
    Used Dev Size : 1953122304 (931.32 GiB 1000.00 GB)
    Data Offset : 2048 sectors
    Super Offset : 8 sectors
    Unused Space : before=1968 sectors, after=648 sectors
    State : clean
    Device UUID : f3c74ca8:076e5078:305ad83b:159f048d
    Update Time : Sun Sep 4 00:08:53 2016
    Checksum : d9306794 - correct
    Events : 5896
    Layout : left-symmetric
    Chunk Size : 512K
    Device Role : Active device 2
    Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdd1:
    Magic : a92b4efc
    Version : 1.2
    Feature Map : 0x0
    Array UUID : 87fdc598:a995d0f7:41123bcf:e2760aeb
    Name : itake:0 (local to host itake)
    Creation Time : Tue Aug 28 17:44:52 2012
    Raid Level : raid5
    Raid Devices : 4
    Avail Dev Size : 1953122952 (931.32 GiB 1000.00 GB)
    Array Size : 2929683456 (2793.96 GiB 3000.00 GB)
    Used Dev Size : 1953122304 (931.32 GiB 1000.00 GB)
    Data Offset : 2048 sectors
    Super Offset : 8 sectors
    Unused Space : before=1968 sectors, after=648 sectors
    State : clean
    Device UUID : 4b376772:29ca4f41:342d39df:877fece0
    Update Time : Sun Sep 11 00:00:26 2016
    Checksum : 639ce9a5 - correct
    Events : 6141
    Layout : left-symmetric
    Chunk Size : 512K
    Device Role : Active device 3
    Array State : AA.A ('A' == active, '.' == missing, 'R' == replacing)
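So I assumed that only sda was damaged and that I would be able to recover the data. I bought a new disk, and now I find that not only sda but also sdc is failing. I would like to know whether there is any chance of repairing either of the two disks, so that I can re-create the array and recover the data before replacing both of them. Below is some information about the errors that may be useful; let me know if you need anything else.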
cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
unused devices: <none>
cat /var/log/syslog
Nov 11 09:16:59 itake kernel: [ 18.230695] sd 0:0:0:0: [sda] Unhandled sense code
Nov 11 09:16:59 itake kernel: [ 18.230698] sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Nov 11 09:16:59 itake kernel: [ 18.230703] sd 0:0:0:0: [sda] Sense Key : Medium Error [current] [descriptor]
Nov 11 09:16:59 itake kernel: [ 18.230708] Descriptor sense data with sense descriptors (in hex):
Nov 11 09:16:59 itake kernel: [ 18.230711] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Nov 11 09:16:59 itake kernel: [ 18.230721] 00 00 08 08
Nov 11 09:16:59 itake kernel: [ 18.230726] sd 0:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
Nov 11 09:16:59 itake kernel: [ 18.230734] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 08 08 00 00 08 00
Nov 11 09:16:59 itake kernel: [ 18.230744] end_request: I/O error, dev sda, sector 2056
Nov 11 09:16:59 itake kernel: [ 18.230796] Buffer I/O error on device sda1, logical block 1
Nov 11 09:16:59 itake kernel: [ 18.230881] ata1: EH complete
...
Nov 11 09:16:59 itake kernel: [ 104.221334] sd 2:0:0:0: [sdc] Unhandled sense code
Nov 11 09:16:59 itake kernel: [ 104.221337] sd 2:0:0:0: [sdc] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Nov 11 09:16:59 itake kernel: [ 104.221342] sd 2:0:0:0: [sdc] Sense Key : Medium Error [current] [descriptor]
Nov 11 09:16:59 itake kernel: [ 104.221347] Descriptor sense data with sense descriptors (in hex):
Nov 11 09:16:59 itake kernel: [ 104.221350] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Nov 11 09:16:59 itake kernel: [ 104.221360] 74 70 68 00
Nov 11 09:16:59 itake kernel: [ 104.221365] sd 2:0:0:0: [sdc] Add. Sense: Unrecovered read error - auto reallocate failed
Nov 11 09:16:59 itake kernel: [ 104.221372] sd 2:0:0:0: [sdc] CDB: Read(10): 28 00 74 70 68 00 00 01 d0 00
Nov 11 09:16:59 itake kernel: [ 104.221381] end_request: I/O error, dev sdc, sector 1953523712
Nov 11 09:16:59 itake kernel: [ 104.221431] Buffer I/O error on device sdc, logical block 244190464
Nov 11 09:16:59 itake kernel: [ 104.221482] Buffer I/O error on device sdc, logical block 244190465
Nov 11 09:16:59 itake kernel: [ 104.221524] Buffer I/O error on device sdc, logical block 244190466
Nov 11 09:16:59 itake kernel: [ 104.221567] Buffer I/O error on device sdc, logical block 244190467
Nov 11 09:16:59 itake kernel: [ 104.221608] Buffer I/O error on device sdc, logical block 244190468
Nov 11 09:16:59 itake kernel: [ 104.221649] Buffer I/O error on device sdc, logical block 244190469
Nov 11 09:16:59 itake kernel: [ 104.221690] Buffer I/O error on device sdc, logical block 244190470
Nov 11 09:16:59 itake kernel: [ 104.221731] Buffer I/O error on device sdc, logical block 244190471
Nov 11 09:16:59 itake kernel: [ 104.221772] Buffer I/O error on device sdc, logical block 244190472
Nov 11 09:16:59 itake kernel: [ 104.221813] Buffer I/O error on device sdc, logical block 244190473
Nov 11 09:16:59 itake kernel: [ 104.221897] ata3: EH complete
Nov 11 09:16:59 itake kernel: [ 107.652344] ata3.00: exception Emask 0x0 SAct 0x7 SErr 0x0 action 0x0
Nov 11 09:16:59 itake kernel: [ 107.652389] ata3.00: irq_stat 0x40000008
Nov 11 09:16:59 itake kernel: [ 107.652429] ata3.00: failed command: READ FPDMA QUEUED
Nov 11 09:16:59 itake kernel: [ 107.652474] ata3.00: cmd 60/08:00:d8:69:70/00:00:74:00:00/40 tag 0 ncq 4096 in
Nov 11 09:16:59 itake kernel: [ 107.652476] res 41/40:00:d8:69:70/00:00:74:00:00/40 Emask 0x409 (media error) <F>
Nov 11 09:16:59 itake kernel: [ 107.652559] ata3.00: status: { DRDY ERR }
Nov 11 09:16:59 itake kernel: [ 107.652598] ata3.00: error: { UNC }
Nov 11 09:16:59 itake kernel: [ 107.654733] ata3.00: configured for UDMA/133
Nov 11 09:16:59 itake kernel: [ 107.654754] ata3: EH complete
...
Nov 11 09:16:59 itake kernel: [ 137.768972] sd 0:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
Nov 11 09:16:59 itake kernel: [ 137.768979] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 08 08 00 00 08 00
Nov 11 09:16:59 itake kernel: [ 137.768989] end_request: I/O error, dev sda, sector 2056
Nov 11 09:16:59 itake kernel: [ 137.769067] ata1: EH complete
Nov 11 09:16:59 itake kernel: [ 137.779624] md: md0 stopped.
Nov 11 09:16:59 itake kernel: [ 138.630989] ata3.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x0
Nov 11 09:16:59 itake kernel: [ 138.631035] ata3.00: irq_stat 0x40000008
Nov 11 09:16:59 itake kernel: [ 138.631076] ata3.00: failed command: READ FPDMA QUEUED
Nov 11 09:16:59 itake kernel: [ 138.631121] ata3.00: cmd 60/08:00:40:68:70/00:00:74:00:00/40 tag 0 ncq 4096 in
Nov 11 09:16:59 itake kernel: [ 138.631123] res 41/40:00:40:68:70/00:00:74:00:00/40 Emask 0x409 (media error) <F>
Nov 11 09:16:59 itake kernel: [ 138.631206] ata3.00: status: { DRDY ERR }
Nov 11 09:16:59 itake kernel: [ 138.631245] ata3.00: error: { UNC }
Nov 11 09:16:59 itake kernel: [ 138.633417] ata3.00: configured for UDMA/133
Nov 11 09:16:59 itake kernel: [ 138.633443] ata3: EH complete
Nov 11 09:16:59 itake kernel: [ 138.637684] md: bind<sdc1>
Nov 11 09:16:59 itake kernel: [ 138.637896] md: bind<sdd1>
Nov 11 09:16:59 itake kernel: [ 138.638139] md: bind<sdb1>
Nov 11 09:16:59 itake kernel: [ 138.638173] md: kicking non-fresh sdc1 from array!
Nov 11 09:16:59 itake kernel: [ 138.638180] md: unbind<sdc1>
Nov 11 09:16:59 itake kernel: [ 138.640178] md: export_rdev(sdc1)
Nov 11 09:16:59 itake kernel: [ 138.640178] md: export_rdev(sdc1)
Nov 11 09:16:59 itake kernel: [ 138.708065] raid6: int64x1 1355 MB/s
Nov 11 09:16:59 itake kernel: [ 138.776062] raid6: int64x2 1504 MB/s
Nov 11 09:16:59 itake kernel: [ 138.844062] raid6: int64x4 1284 MB/s
Nov 11 09:16:59 itake kernel: [ 138.912061] raid6: int64x8 1109 MB/s
Nov 11 09:16:59 itake kernel: [ 138.980085] raid6: sse2x1 2124 MB/s
Nov 11 09:16:59 itake kernel: [ 139.048065] raid6: sse2x2 3413 MB/s
Nov 11 09:16:59 itake kernel: [ 139.116061] raid6: sse2x4 4022 MB/s
Nov 11 09:16:59 itake kernel: [ 139.116064] raid6: using algorithm sse2x4 (4022 MB/s)
Nov 11 09:16:59 itake kernel: [ 139.116302] async_tx: api initialized (async)
Nov 11 09:16:59 itake kernel: [ 139.116471] xor: automatically using best checksumming function: generic_sse
Nov 11 09:16:59 itake kernel: [ 139.136056] generic_sse: 6183.000 MB/sec
Nov 11 09:16:59 itake kernel: [ 139.136059] xor: using function: generic_sse (6183.000 MB/sec)
Nov 11 09:16:59 itake kernel: [ 139.137667] md: raid6 personality registered for level 6
Nov 11 09:16:59 itake kernel: [ 139.137671] md: raid5 personality registered for level 5
Nov 11 09:16:59 itake kernel: [ 139.137674] md: raid4 personality registered for level 4
Nov 11 09:16:59 itake kernel: [ 139.137936] bio: create slab <bio-1> at 1
Nov 11 09:16:59 itake kernel: [ 139.137960] md/raid:md0: device sdb1 operational as raid disk 1
Nov 11 09:16:59 itake kernel: [ 139.137964] md/raid:md0: device sdd1 operational as raid disk 3
Nov 11 09:16:59 itake kernel: [ 139.138427] md/raid:md0: allocated 4280kB
Nov 11 09:16:59 itake kernel: [ 139.138551] md/raid:md0: not enough operational devices (2/4 failed)
Nov 11 09:16:59 itake kernel: [ 139.138628] RAID conf printout:
Nov 11 09:16:59 itake kernel: [ 139.138630] --- level:5 rd:4 wd:2
Nov 11 09:16:59 itake kernel: [ 139.138634] disk 1, o:1, dev:sdb1
Nov 11 09:16:59 itake kernel: [ 139.138637] disk 3, o:1, dev:sdd1
Nov 11 09:16:59 itake kernel: [ 139.139106] md/raid:md0: failed to run raid set.
Nov 11 09:16:59 itake kernel: [ 139.139146] md: pers->run() failed ...
Nov 11 09:16:59 itake kernel: [ 139.139523] md: md0 stopped.
Nov 11 09:16:59 itake kernel: [ 139.139532] md: unbind<sdb1>
Nov 11 09:16:59 itake kernel: [ 139.156130] md: export_rdev(sdb1)
Nov 11 09:16:59 itake kernel: [ 139.156158] md: unbind<sdd1>
Nov 11 09:16:59 itake kernel: [ 139.168118] md: export_rdev(sdd1)
...
smartctl -a /dev/sda
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-3.2.0-4-amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     SAMSUNG SpinPoint F3
Device Model:     SAMSUNG HD103SJ
Serial Number:    S246J1KZ410348
LU WWN Device Id: 5 0024e9 0034ebb37
Firmware Version: 1AJ10001
User Capacity:    1,000,203,804,160 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Fri Nov 11 11:17:00 2016 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00)
    Offline data collection activity was never started.
    Auto Offline Data Collection: Disabled.
Self-test execution status: ( 121)
    The previous self-test completed having the read element of the test failed.
Total time to complete Offline data collection: ( 9300) seconds.
Offline data collection capabilities: (0x5b)
    SMART execute Offline immediate.
    Auto Offline data collection on/off support.
    Suspend Offline collection upon new command.
    Offline surface scan supported.
    Self-test supported.
    No Conveyance Self-test supported.
    Selective Self-test supported.
SMART capabilities: (0x0003)
    Saves SMART data before entering power-saving mode.
    Supports SMART auto save timer.
Error logging capability: (0x01)
    Error logging supported.
    General Purpose Logging supported.
Short self-test routine recommended polling time: ( 2) minutes.
Extended self-test routine recommended polling time: ( 155) minutes.
SCT capabilities: (0x003f)
    SCT Status supported.
    SCT Error Recovery Control supported.
    SCT Feature Control supported.
    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f 100   100   051    Pre-fail Always  -           784
  2 Throughput_Performance  0x0026 252   252   000    Old_age  Always  -           0
  3 Spin_Up_Time            0x0023 070   069   025    Pre-fail Always  -           9385
  4 Start_Stop_Count        0x0032 099   099   000    Old_age  Always  -           1225
  5 Reallocated_Sector_Ct   0x0033 252   252   010    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x002e 252   252   051    Old_age  Always  -           0
  8 Seek_Time_Performance   0x0024 252   252   015    Old_age  Offline -           0
  9 Power_On_Hours          0x0032 100   100   000    Old_age  Always  -           30077
 10 Spin_Retry_Count        0x0032 252   252   051    Old_age  Always  -           0
 11 Calibration_Retry_Count 0x0032 252   252   000    Old_age  Always  -           0
 12 Power_Cycle_Count       0x0032 100   100   000    Old_age  Always  -           212
191 G-Sense_Error_Rate      0x0022 252   252   000    Old_age  Always  -           0
192 Power-Off_Retract_Count 0x0022 252   252   000    Old_age  Always  -           0
194 Temperature_Celsius     0x0002 064   052   000    Old_age  Always  -           21 (Min/Max 11/48)
195 Hardware_ECC_Recovered  0x003a 100   100   000    Old_age  Always  -           0
196 Reallocated_Event_Count 0x0032 252   252   000    Old_age  Always  -           0
197 Current_Pending_Sector  0x0032 100   100   000    Old_age  Always  -           2
198 Offline_Uncorrectable   0x0030 252   252   000    Old_age  Offline -           0
199 UDMA_CRC_Error_Count    0x0036 200   200   000    Old_age  Always  -           0
200 Multi_Zone_Error_Rate   0x002a 100   100   000    Old_age  Always  -           64
223 Load_Retry_Count        0x0032 252   252   000    Old_age  Always  -           0
225 Load_Cycle_Count        0x0032 100   100   000    Old_age  Always  -           1244

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status                  Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline    Completed: read failure 90%       30077           2056
# 2 Extended offline Completed: read failure 90%       30077           2056

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
   1       0       0 Completed_read_failure [90% left] (0-65535)
   2       0       0 Not_testing
   3       0       0 Not_testing
   4       0       0 Not_testing
   5       0       0 Not_testing
Selective self-test flags (0x0):
    After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
smartctl -a /dev/sdc
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-3.2.0-4-amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Green
Device Model:     WDC WD10EZRX-00A8LB0
Serial Number:    WD-WMC1U5433779
LU WWN Device Id: 5 0014ee 657f09173
Firmware Version: 01.01A01
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Fri Nov 11 11:17:08 2016 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x85)
    Offline data collection activity was aborted by an interrupting command from host.
    Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0)
    The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: (12960) seconds.
Offline data collection capabilities: (0x7b)
    SMART execute Offline immediate.
    Auto Offline data collection on/off support.
    Suspend Offline collection upon new command.
    Offline surface scan supported.
    Self-test supported.
    Conveyance Self-test supported.
    Selective Self-test supported.
SMART capabilities: (0x0003)
    Saves SMART data before entering power-saving mode.
    Supports SMART auto save timer.
Error logging capability: (0x01)
    Error logging supported.
    General Purpose Logging supported.
Short self-test routine recommended polling time: ( 2) minutes.
Extended self-test routine recommended polling time: ( 148) minutes.
Conveyance self-test routine recommended polling time: ( 5) minutes.
SCT capabilities: (0x30b5)
    SCT Status supported.
    SCT Feature Control supported.
    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f 200   200   051    Pre-fail Always  -           6787
  3 Spin_Up_Time            0x0027 139   137   021    Pre-fail Always  -           4041
  4 Start_Stop_Count        0x0032 096   096   000    Old_age  Always  -           4454
  5 Reallocated_Sector_Ct   0x0033 171   171   140    Pre-fail Always  -           1262
  7 Seek_Error_Rate         0x002e 200   200   000    Old_age  Always  -           0
  9 Power_On_Hours          0x0032 054   054   000    Old_age  Always  -           33859
 10 Spin_Retry_Count        0x0032 100   100   000    Old_age  Always  -           0
 11 Calibration_Retry_Count 0x0032 100   100   000    Old_age  Always  -           0
 12 Power_Cycle_Count       0x0032 100   100   000    Old_age  Always  -           145
192 Power-Off_Retract_Count 0x0032 200   200   000    Old_age  Always  -           31
193 Load_Cycle_Count        0x0032 170   170   000    Old_age  Always  -           90973
194 Temperature_Celsius     0x0022 119   104   000    Old_age  Always  -           24
196 Reallocated_Event_Count 0x0032 001   001   000    Old_age  Always  -           819
197 Current_Pending_Sector  0x0032 198   198   000    Old_age  Always  -           326
198 Offline_Uncorrectable   0x0030 200   200   000    Old_age  Offline -           0
199 UDMA_CRC_Error_Count    0x0032 200   200   000    Old_age  Always  -           0
200 Multi_Zone_Error_Rate   0x0008 200   200   000    Old_age  Offline -           80

SMART Error Log Version: 1
ATA Error Count: 11 (device log contains only the most recent five errors)
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 11 occurred at disk power-on lifetime: 33842 hours (1410 days + 2 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 60 b0 a8 d7 e1  Error: UNC 96 sectors at LBA = 0x01d7a8b0 = 30910640

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time   Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 60 a0 a8 d7 e1 08  17d+13:32:19.391  READ DMA
  ef 10 02 00 00 00 a0 08  17d+13:32:19.391  SET FEATURES [Enable SATA feature]
  ec 00 00 00 00 00 a0 08  17d+13:32:19.390  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 08  17d+13:32:19.390  SET FEATURES [Set transfer mode]

Error 10 occurred at disk power-on lifetime: 33842 hours (1410 days + 2 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 60 c8 a8 d7 e1  Error: UNC 96 sectors at LBA = 0x01d7a8c8 = 30910664

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time   Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 60 a0 a8 d7 e1 08  17d+13:32:11.498  READ DMA
  ef 10 02 00 00 00 a0 08  17d+13:32:11.498  SET FEATURES [Enable SATA feature]
  ec 00 00 00 00 00 a0 08  17d+13:32:11.497  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 08  17d+13:32:11.497  SET FEATURES [Set transfer mode]

Error 9 occurred at disk power-on lifetime: 33842 hours (1410 days + 2 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 60 a8 a8 d7 e1  Error: UNC 96 sectors at LBA = 0x01d7a8a8 = 30910632

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time   Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 60 a0 a8 d7 e1 08  17d+13:32:07.412  READ DMA
  c8 00 08 98 a8 d7 e1 08  17d+13:32:07.412  READ DMA

Error 8 occurred at disk power-on lifetime: 33841 hours (1410 days + 1 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 48 f9 80 e1  Error: UNC 8 sectors at LBA = 0x0180f948 = 25229640

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time   Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 48 f9 80 e1 08  17d+13:12:31.938  READ DMA
  ef 10 02 00 00 00 a0 08  17d+13:12:31.938  SET FEATURES [Enable SATA feature]
  ec 00 00 00 00 00 a0 08  17d+13:12:31.937  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 08  17d+13:12:31.937  SET FEATURES [Set transfer mode]

Error 7 occurred at disk power-on lifetime: 33841 hours (1410 days + 1 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 48 f9 80 e1  Error: UNC 8 sectors at LBA = 0x0180f948 = 25229640

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time   Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 48 f9 80 e1 08  17d+13:12:29.429  READ DMA
  c8 00 b8 48 f5 80 e1 08  17d+13:12:29.410  READ DMA

SMART Self-test log structure revision number 1
Num Test_Description Status                  Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 90%       33858           384848
# 2 Short offline    Completed: read failure 80%       33857           1951525160

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
   1       0       0 Not_testing
   2       0       0 Not_testing
   3       0       0 Not_testing
   4       0       0 Not_testing
   5       0       0 Not_testing
Selective self-test flags (0x0):
    After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Edit
I copied the sda disk using:
ddrescue -d -f -r3 /dev/sda /dev/sde ddrescue.logfile
GNU ddrescue 1.19
Initial status (read from logfile)
rescued: 1000 GB, errsize: 4096 B, errors: 3

Current status
rescued: 1000 GB, errsize: 4096 B, current rate: 0 B/s
   ipos: 1052 kB, errors: 3, average rate: 0 B/s
   opos: 1052 kB, run time: 3.13 m, successful read: 3.13 m ago
But the copied disk, sde, also reports:
mdadm: No md superblock detected on /dev/sde1.
Can I recover those superblocks?
We need to split your problem in two:
- Rebuilding the array to make up for the lost superblock on /dev/sda.
- Dealing with the damaged data on /dev/sdc.
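Rebuilding the array
I assume that the 4 kB block that could not be recovered from /dev/sda is the superblock: your partition starts at 1 MiB (sector 2048), and the superblock sits 8 sectors into the partition (sector 2056), which is exactly where the bad sector is reported. I am also assuming that the REST of the data on that drive is 100% intact.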
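The sector arithmetic behind that assumption can be checked quickly. This is a minimal sketch, assuming 512-byte logical sectors (as smartctl reports for /dev/sda) and a partition that starts at sector 2048; the variable names are illustrative only.

```shell
#!/bin/bash
# Illustrative variable names; the values come from the outputs in the question.
sector_size=512   # bytes per logical sector (smartctl: 512 bytes logical/physical)
part_start=2048   # ASSUMED first sector of sda1 (1 MiB alignment)
super_offset=8    # "Super Offset : 8 sectors" from mdadm --examine

# Absolute sector of the md superblock, counted from the start of the disk.
sb_sector=$((part_start + super_offset))
# Byte position of that sector, for comparison with ddrescue's ipos.
sb_byte=$((sb_sector * sector_size))

echo "superblock sector: $sb_sector"
echo "superblock byte offset: $sb_byte"
```

The result, sector 2056 at byte 1052672, lines up with the I/O error on sector 2056 in the kernel log and with the 1052 kB position reported by ddrescue.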
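In general, the problem with --assume-clean is that you need to be very careful that the parameters EXACTLY match those used when the array was created. Changes in the defaults since your array's creation date, Tue Aug 28 17:44:52 2012, are your enemy: metadata version, bitmap assumptions (devices over 100 GB now get one automatically), RAID layout, and so on. If you are not certain, I strongly suggest copying all 4 drives to other drives (in an absolute pinch, you could even use a single drive with 4 partitions of 1 TB / 1953122952 sectors each) and attempting the reassembly on the copies instead. Since /dev/sdc started to fail just as /dev/sda did (you probably bought all the drives at the same time), I would perhaps even make two complete copies of the data, in case further drives fail while you are attempting the recovery (depending on how much you think your data is worth). If --assume-clean works on the test copy, you can then move the data somewhere else.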
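One way to plan those copies is sketched below. It only prints the ddrescue commands rather than running them. The ddrescue options are the ones already used in the question; the target devices for /dev/sdb and /dev/sdd (/dev/sdg and /dev/sdh) and the map file names are hypothetical placeholders, not something known about this system.

```shell
#!/bin/bash
# Hypothetical mapping: source member disk -> blank target disk.
# The targets are placeholders; verify every device name before running anything.
sources=(/dev/sda /dev/sdb /dev/sdc /dev/sdd)
targets=(/dev/sde /dev/sdg /dev/sdf /dev/sdh)

# Print one ddrescue command per disk (dry run, nothing is copied).
plan_copies() {
  local i
  for i in "${!sources[@]}"; do
    # -d: direct disc access, -f: allow writing to a device, -r3: retry bad areas 3 times
    echo "ddrescue -d -f -r3 ${sources[i]} ${targets[i]} $(basename "${sources[i]}").map"
  done
}

plan_copies
```

Printing the commands first makes it easy to review source and target before anything is written, since swapping the two would overwrite a surviving member.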
Here is some scripting help for you, distilling what you gave above (note that --examine reports some values in sectors, but --create takes them in KiB).
#!/bin/bash
# Facts we know about your array, from your mdadm -E output.
num_devices=4
num_spares=0
chunk_size=512        # KiB
data_offset=1024      # KiB
drive_size=976561152  # KiB
uuid=87fdc598:a995d0f7:41123bcf:e2760aeb
metadata_ver=1.2
name=itake:0
bitmap=none
original_device_order=(/dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1)

# I assume you recovered:
# /dev/sda to /dev/sde
# /dev/sdc to /dev/sdf
# If not, adjust as needed.
#
# Read all the way to the bottom of my response first, because you
# MIGHT want to use 'missing' in place of /dev/sdf1 initially.
new_device_order=(/dev/sde1 /dev/sdb1 /dev/sdf1 /dev/sdd1)

# --assume-clean (discussed above) keeps mdadm from starting a resync
# that would rewrite one member as soon as the array is created.
mdadm --create /dev/md0 --assume-clean \
  --level 5 --layout left-symmetric -n "$num_devices" -x "$num_spares" \
  --uuid "$uuid" \
  --metadata "$metadata_ver" \
  --chunk "$chunk_size" --size "$drive_size" \
  --data-offset "$data_offset" \
  --name "$name" \
  --bitmap "$bitmap" \
  "${new_device_order[@]}"
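The KiB values in the script come from the sector counts that --examine printed. Here is a small sketch of that conversion, assuming 512-byte sectors; the helper name sectors_to_kib is made up for this example.

```shell
#!/bin/bash
# Convert a count of 512-byte sectors to KiB (2 sectors per KiB).
# sectors_to_kib is a hypothetical helper, not part of mdadm.
sectors_to_kib() {
  echo $(( $1 / 2 ))
}

data_offset=$(sectors_to_kib 2048)        # "Data Offset : 2048 sectors"
drive_size=$(sectors_to_kib 1953122304)   # "Used Dev Size : 1953122304"

# Cross-check: a 4-disk RAID 5 stores data on 3 disks' worth of space.
# mdadm prints "Array Size" in KiB already, so the numbers should agree.
array_size=$(( (4 - 1) * drive_size ))

echo "data_offset=${data_offset} KiB"
echo "drive_size=${drive_size} KiB"
echo "array_size=${array_size} KiB"
```

The computed array size, 2929683456 KiB, matches the "Array Size" line in the --examine output, which suggests the per-device size was converted correctly.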
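Scrubbing the array (option 1)
We know that /dev/sdc has bad sectors in the middle of your data area. As a more nuclear option, you can use missing in place of /dev/sdf1 when creating the array, and then run mdadm --add /dev/md0 /dev/sdf1 to force a rebuild from the other devices onto /dev/sdf1.
Scrubbing the array (option 2)
Now we need to deal with the fact that parts of /dev/sdc are gone and that /dev/sdf holds only blocks of zeros in their place.
echo repair >/sys/devices/virtual/block/md0/md/sync_action
In the end, this should repair it, but I am not 100% sure how it behaves when recovering those blocks.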
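To see what a repair or rebuild is doing, the md sysfs files can be polled. This is a minimal sketch: sync_action, sync_completed and mismatch_cnt are standard md sysfs attributes, while the function name md_status and its optional directory argument are illustrative only.

```shell
#!/bin/bash
# Print the current sync state of an md array.
# $1: the array's md sysfs directory (defaults to the md0 path used above).
md_status() {
  local md_dir=${1:-/sys/devices/virtual/block/md0/md}
  local f
  for f in sync_action sync_completed mismatch_cnt; do
    if [ -r "$md_dir/$f" ]; then
      echo "$f: $(cat "$md_dir/$f")"
    else
      echo "$f: unavailable"
    fi
  done
}

md_status "$@"
```

sync_action returns to idle once the operation has finished; a non-zero mismatch_cnt after the run means md found places where data and parity did not agree.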