Disk
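SSD won't mount: bad superblock but no bad blocks: write errors
Just noticed I had been writing SDD for SSD. Corrected.
I need help interpreting this situation.
/dev/sda is a data disk that is backed up and holds reproducible data, so this is not system critical, but I would like to avoid the work of restoring/rebuilding the data, some of which would be very time consuming. Is it possible to recover/repair it?
If so, how? And if I wipe the disk to reuse it, how reliable will it be?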
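Summary (detailed reports below):
- won't mount: bad superblock
- badblocks finds no bad blocks
- smartctl reports no errors
- fsck is unable to set superblock flags
- fdisk shows a clean partition
- dmesg shows write errors
- parted shows 792 GB free on a 1 TB drive
Mounting the SSD fails as follows: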
    [stephen@meer ~]$ sudo mount /dev/sda1 /mnt/sda
    mount: /mnt/sda: can't read superblock on /dev/sda1.
           dmesg(1) may have more information after failed mount system call.
    [stephen@meer ~]$
Yet badblocks finds no bad blocks:
    [root@meer stephen]# badblocks -v /dev/sda1
    Checking blocks 0 to 976760831
    Checking for bad blocks (read-only test): done
    Pass completed, 0 bad blocks found. (0/0/0 errors)
And smartctl finds no errors either:
    [root@meer stephen]# smartctl -a /dev/sda
    smartctl 7.3 2022-02-28 r5338 [x86_64-linux-5.17.9-arch1-1] (local build)
    Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF INFORMATION SECTION ===
    Model Family:     WD Blue / Red / Green SSDs
    Device Model:     WDC WDS100T2B0A-00SM50
    Serial Number:    213159800516
    LU WWN Device Id: 5 001b44 8bc4fdc6e
    Firmware Version: 415020WD
    User Capacity:    1,000,204,886,016 bytes [1.00 TB]
    Sector Size:      512 bytes logical/physical
    Rotation Rate:    Solid State Device
    Form Factor:      2.5 inches
    TRIM Command:     Available, deterministic, zeroed
    Device is:        In smartctl database 7.3/5319
    ATA Version is:   ACS-4 T13/BSR INCITS 529 revision 5
    SATA Version is:  SATA 3.3, 6.0 Gb/s (current: 1.5 Gb/s)
    Local Time is:    Tue May 24 16:06:23 2022 PDT
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled

    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED

    General SMART Values:
    Offline data collection status:  (0x00) Offline data collection activity
                                            was never started.
                                            Auto Offline Data Collection: Disabled.
    Self-test execution status:      (   0) The previous self-test routine completed
                                            without error or no self-test has ever
                                            been run.
    Total time to complete Offline
    data collection:                 (    0) seconds.
    Offline data collection
    capabilities:                    (0x11) SMART execute Offline immediate.
                                            No Auto Offline data collection support.
                                            Suspend Offline collection upon new
                                            command.
                                            No Offline surface scan supported.
                                            Self-test supported.
                                            No Conveyance Self-test supported.
                                            No Selective Self-test supported.
    SMART capabilities:            (0x0003) Saves SMART data before entering
                                            power-saving mode.
                                            Supports SMART auto save timer.
    Error logging capability:        (0x01) Error logging supported.
                                            General Purpose Logging supported.
    Short self-test routine
    recommended polling time:        (   2) minutes.
    Extended self-test routine
    recommended polling time:        (  10) minutes.

    SMART Attributes Data Structure revision number: 4
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      5 Reallocated_Sector_Ct   0x0032   100   100   ---    Old_age   Always       -       124
      9 Power_On_Hours          0x0032   100   100   ---    Old_age   Always       -       1470
     12 Power_Cycle_Count       0x0032   100   100   ---    Old_age   Always       -       134
    165 Block_Erase_Count       0x0032   100   100   ---    Old_age   Always       -       4312400063
    166 Minimum_PE_Cycles_TLC   0x0032   100   100   ---    Old_age   Always       -       1
    167 Max_Bad_Blocks_per_Die  0x0032   100   100   ---    Old_age   Always       -       65
    168 Maximum_PE_Cycles_TLC   0x0032   100   100   ---    Old_age   Always       -       14
    169 Total_Bad_Blocks        0x0032   100   100   ---    Old_age   Always       -       630
    170 Grown_Bad_Blocks        0x0032   100   100   ---    Old_age   Always       -       124
    171 Program_Fail_Count      0x0032   100   100   ---    Old_age   Always       -       128
    172 Erase_Fail_Count        0x0032   100   100   ---    Old_age   Always       -       0
    173 Average_PE_Cycles_TLC   0x0032   100   100   ---    Old_age   Always       -       2
    174 Unexpected_Power_Loss   0x0032   100   100   ---    Old_age   Always       -       90
    184 End-to-End_Error        0x0032   100   100   ---    Old_age   Always       -       0
    187 Reported_Uncorrect      0x0032   100   100   ---    Old_age   Always       -       0
    188 Command_Timeout         0x0032   100   100   ---    Old_age   Always       -       64
    194 Temperature_Celsius     0x0022   070   053   ---    Old_age   Always       -       30 (Min/Max 18/53)
    199 UDMA_CRC_Error_Count    0x0032   100   100   ---    Old_age   Always       -       0
    230 Media_Wearout_Indicator 0x0032   001   001   ---    Old_age   Always       -       0x002600140026
    232 Available_Reservd_Space 0x0033   097   097   004    Pre-fail  Always       -       97
    233 NAND_GB_Written_TLC     0x0032   100   100   ---    Old_age   Always       -       2703
    234 NAND_GB_Written_SLC     0x0032   100   100   ---    Old_age   Always       -       2842
    241 Host_Writes_GiB         0x0030   253   253   ---    Old_age   Offline      -       466
    242 Host_Reads_GiB          0x0030   253   253   ---    Old_age   Offline      -       622
    244 Temp_Throttle_Status    0x0032   000   100   ---    Old_age   Always       -       0

    SMART Error Log Version: 1
    No Errors Logged

    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Extended offline    Completed without error       00%      1470         -

    Selective Self-tests/Logging not supported
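The overall health result above is PASSED, yet several raw counters in the attribute table are non-zero. As a rough sketch, the rows can be parsed and the failure-related counters listed. The rows below are copied from the output above; the `WATCH` list is a hypothetical selection made for this illustration, not an official smartmontools notion, and whether these counts explain the failure is not established here.

```python
import re

# A few rows copied verbatim from the smartctl -a output above.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0032   100   100   ---    Old_age   Always       -       124
170 Grown_Bad_Blocks        0x0032   100   100   ---    Old_age   Always       -       124
171 Program_Fail_Count      0x0032   100   100   ---    Old_age   Always       -       128
172 Erase_Fail_Count        0x0032   100   100   ---    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   ---    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   ---    Old_age   Always       -       64
"""

# Columns: ID, name, flag, value, worst, thresh, type, updated, when_failed, raw.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]{4}\s+\d+\s+\d+"
    r"\s+\S+\s+\S+\s+\S+\s+\S+\s+(\S+)"
)

# Hypothetical watch list chosen for this example only.
WATCH = (
    "Reallocated_Sector_Ct",
    "Grown_Bad_Blocks",
    "Program_Fail_Count",
    "Erase_Fail_Count",
    "Reported_Uncorrect",
    "Command_Timeout",
)


def parse_attributes(text):
    """Return {attribute_name: raw_value} for rows whose raw value is a plain integer."""
    attrs = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match and match.group(3).isdigit():
            attrs[match.group(2)] = int(match.group(3))
    return attrs


def nonzero_watched(attrs):
    """Return the watched attributes whose raw value is greater than zero."""
    return {name: attrs[name] for name in WATCH if attrs.get(name, 0) > 0}


attrs = parse_attributes(SAMPLE)
flagged = nonzero_watched(attrs)
for name, raw in flagged.items():
    print(f"{name}: raw value {raw}")
```

On the sample rows this lists Reallocated_Sector_Ct, Grown_Bad_Blocks, Program_Fail_Count and Command_Timeout, while Erase_Fail_Count and Reported_Uncorrect stay at zero.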
And fsck fails:
    [root@meer ~]# e2fsck -cfpv /dev/sda1
    /dev/sda1: recovering journal
    e2fsck: Input/output error while recovering journal of /dev/sda1
    e2fsck: unable to set superblock flags on /dev/sda1

    /dev/sda1: ********** WARNING: Filesystem still has errors **********

    May 24 15:38:29 meer kernel: I/O error, dev sda, sector 121899008 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
    May 24 15:38:29 meer kernel: sd 2:0:0:0: [sda] tag#31 CDB: Write(10) 2a 00 07 44 08 00 00 00 08 00
    May 24 15:38:29 meer kernel: sd 2:0:0:0: [sda] tag#31 Add. Sense: Unaligned write command
    May 24 15:38:29 meer kernel: sd 2:0:0:0: [sda] tag#31 Sense Key : Illegal Request [current]
    May 24 15:38:29 meer kernel: sd 2:0:0:0: [sda] tag#31 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
    May 24 15:38:29 meer kernel: ata3.00: configured for UDMA/33
    May 24 15:38:29 meer kernel: ata3.00: error: { ABRT }
    May 24 15:38:29 meer kernel: ata3.00: status: { DRDY ERR }
    May 24 15:38:29 meer kernel: ata3.00: cmd ca/00:08:00:08:44/00:00:00:00:00/e7 tag 31 dma 4096 out
                                          res 51/04:08:00:08:44/00:00:07:00:00/e7 Emask 0x1 (device error)
    May 24 15:38:29 meer kernel: ata3.00: failed command: WRITE DMA
    May 24 15:38:29 meer kernel: ata3.00: irq_stat 0x40000001
    May 24 15:38:29 meer kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
    May 24 15:38:29 meer kernel: ata3: EH complete
    May 24 15:38:29 meer kernel: ata3.00: configured for UDMA/33
    May 24 15:38:29 meer kernel: ata3.00: error: { ABRT }
    May 24 15:38:29 meer kernel: ata3.00: status: { DRDY ERR }
    May 24 15:38:29 meer kernel: ata3.00: cmd ca/00:08:00:08:44/00:00:00:00:00/e7 tag 6 dma 4096 out
                                          res 51/04:08:00:08:44/00:00:07:00:00/e7 Emask 0x1 (device error)
    May 24 15:38:29 meer kernel: ata3.00: failed command: WRITE DMA
    May 24 15:38:29 meer kernel: ata3.00: irq_stat 0x40000001
    May 24 15:38:29 meer kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
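The Write(10) CDB in the kernel log above can be decoded by hand to confirm which sector the failing write targets. The sketch below uses the standard WRITE(10) layout (opcode 0x2A in byte 0, a big-endian 32-bit LBA in bytes 2 to 5, a big-endian 16-bit transfer length in bytes 7 and 8). The 512-byte sector size and the partition start at sector 2048 are taken from the fdisk output in this post; the function name is made up for the example.

```python
SECTOR_SIZE = 512        # logical sector size reported by fdisk and smartctl
PARTITION_START = 2048   # first sector of /dev/sda1 according to fdisk


def decode_write10(cdb_hex):
    """Decode a SCSI WRITE(10) CDB given as space-separated hex bytes.

    Returns (lba, blocks): the starting logical block address and the
    number of blocks the command asks to write.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # bytes 2..5: LBA
    blocks = int.from_bytes(cdb[7:9], "big")   # bytes 7..8: transfer length
    return lba, blocks


# CDB copied from the kernel log line "CDB: Write(10) 2a 00 07 44 08 00 00 00 08 00".
lba, blocks = decode_write10("2a 00 07 44 08 00 00 00 08 00")
length_bytes = blocks * SECTOR_SIZE
offset_in_partition = lba - PARTITION_START

print(f"LBA {lba}, {blocks} sectors ({length_bytes} bytes)")
print(f"{offset_in_partition} sectors into /dev/sda1")
```

The decoded LBA is 121899008, the same sector named in the "I/O error, dev sda" line, and the 8-sector length matches the "dma 4096" in the ATA command line. Both retries in the log target that one location.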
The partition as seen by fdisk:
    Disk /dev/sda: 931.51 GiB, 1000204886016 bytes, 1953525168 sectors
    Disk model: WDC WDS100T2B0A
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disklabel type: gpt
    Disk identifier: 3F701164-2CF8-6D48-A94E-478634C140BE

    Device     Start        End    Sectors   Size Type
    /dev/sda1   2048 1953523711 1953521664 931.5G Linux filesystem
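As a quick sanity check, the numbers reported by fdisk and badblocks can be cross-checked against each other. This sketch only re-derives figures that already appear in the output above; the one assumption is that badblocks used its default block size of 1024 bytes, since no `-b` option was given.

```python
SECTOR_SIZE = 512                 # bytes, from the fdisk "Units" line
BADBLOCKS_BLOCK_SIZE = 1024       # bytes, badblocks default (assumed, no -b was given)

disk_sectors = 1953525168         # fdisk: total sectors on /dev/sda
part_start = 2048                 # fdisk: first sector of /dev/sda1
part_end = 1953523711             # fdisk: last sector of /dev/sda1
badblocks_last_block = 976760831  # badblocks: "Checking blocks 0 to 976760831"

disk_bytes = disk_sectors * SECTOR_SIZE
disk_gib = disk_bytes / 2**30

part_sectors = part_end - part_start + 1
part_bytes = part_sectors * SECTOR_SIZE

# Bytes covered by the read-only badblocks pass over /dev/sda1.
scanned_bytes = (badblocks_last_block + 1) * BADBLOCKS_BLOCK_SIZE

print(f"disk: {disk_bytes} bytes = {disk_gib:.2f} GiB")
print(f"partition: {part_sectors} sectors")
print(f"badblocks covered the whole partition: {scanned_bytes == part_bytes}")
```

The figures agree with one another, and under the 1024-byte assumption the read-only badblocks pass covered every sector of the partition. The errors in the kernel log are all on writes, which a read-only test does not exercise.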
From dmesg:
    [ 5292.895300] ata3.00: configured for UDMA/33
    [ 5292.895315] ata3: EH complete
    [ 5293.021851] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
    [ 5293.021859] ata3.00: irq_stat 0x40000001
    [ 5293.021864] ata3.00: failed command: WRITE DMA
    [ 5293.021866] ata3.00: cmd ca/00:08:00:08:44/00:00:00:00:00/e7 tag 18 dma 4096 out
                            res 51/04:08:00:08:44/00:00:07:00:00/e7 Emask 0x1 (device error)
    [ 5293.021874] ata3.00: status: { DRDY ERR }
    [ 5293.021877] ata3.00: error: { ABRT }
parted:
    [root@meer stephen]# parted /dev/sda
    GNU Parted 3.5
    Using /dev/sda
    Welcome to GNU Parted! Type 'help' to view a list of commands.
    (parted) print free
    Model: ATA WDC WDS100T2B0A (scsi)
    Disk /dev/sda: 1000GB
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt
    Disk Flags:

    Number  Start   End     Size    File system  Name  Flags
            17.4kB  1049kB  1031kB  Free Space
     1      1049kB  1000GB  1000GB  ext4
            1000GB  1000GB  729kB   Free Space
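I don't know what you have been doing with this disk, but these are crazy numbers! Looking at the output, the SSD has:
- been powered on for 1470 hours (61 days)
- performed 4312400063 block erases (2.0 TiB if each unit is 512 bytes)
- recorded 163210068006 media writes (76 TiB on the same reading; this is the Media_Wearout_Indicator raw value 0x002600140026 in decimal).
That works out to a constant write rate of roughly 15 MiB (16 MB) per second for 61 days.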
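The arithmetic behind these figures can be reproduced as follows. This is only a sketch of the calculation, and it rests on one loud assumption: that the Media_Wearout_Indicator raw value counts 512-byte units. The smartctl output does not state the unit, so the result is only as good as that reading.

```python
ASSUMED_UNIT_BYTES = 512     # assumption: raw value counts 512-byte units
POWER_ON_HOURS = 1470        # SMART attribute 9, Power_On_Hours

# SMART attribute 230, Media_Wearout_Indicator, raw value as printed by smartctl.
raw_media_writes = int("0x002600140026", 16)

bytes_written = raw_media_writes * ASSUMED_UNIT_BYTES
tib_written = bytes_written / 2**40

days_on = POWER_ON_HOURS / 24
seconds_on = POWER_ON_HOURS * 3600

# Average rate if the writes were spread evenly over the power-on time.
rate_mib_s = bytes_written / seconds_on / 2**20
rate_mb_s = bytes_written / seconds_on / 10**6

print(f"raw value: {raw_media_writes}")
print(f"written:   {tib_written:.1f} TiB over {days_on:.1f} days")
print(f"rate:      {rate_mib_s:.1f} MiB/s ({rate_mb_s:.1f} MB/s)")
```

Under that assumption the total comes to about 76.0 TiB, or an average of about 15.1 MiB/s (15.8 MB/s) sustained for the whole power-on time.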
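I think you have an internal NAND failure. You may not be able to get your data back.
Going forward, I suggest the best solution is some form of RAID mirroring, so that errors are absorbed across more than one disk.
Ideally, to spread the risk of errors and failures across the disks, pair two disks of different ages and/or from different production batches.
To clarify, I consider this an abnormally large amount of writing in a very short time. You need to factor that into whatever storage setup you use.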