Disk
SMART test completes without failure after a previous test failed, without reallocating any sectors?
I have a drive that failed its SMART tests, as shown by smartctl -a /dev/sdc:
...
# 1  Short offline       Completed: read failure       50%      6354         4377408
# 2  Extended offline    Completed: read failure       90%      6354         4377408
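I then wanted this "sector" to be marked as bad, so I figured I just needed to write a lot of data over it. So I used dd to write a bunch of zeros. This filled the drive, after which I ran another SMART test. It completed successfully, but looking at the SMART attributes I don't see any change: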
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
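Apart from being fully aware that I am always at risk of drive failure, is the above information relevant with regard to a drive failure?
Here is the diff of the smartctl output before/after: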
diff --git a/x.txt b/x.txt
index 4cfe1b7..1bcace5 100644
--- a/x.txt
+++ b/x.txt
@@ -12,7 +12,7 @@ Sector Sizes: 512 bytes logical, 4096 bytes physical
 Device is: In smartctl database [for details use: -P show]
 ATA Version is: ACS-2 (minor revision not indicated)
 SATA Version is: SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s)
-Local Time is: Sun Feb 24 16:50:01 2019 GMT
+Local Time is: Mon Feb 25 18:33:35 2019 GMT
 SMART support is: Available - device has SMART capability.
 SMART support is: Enabled
@@ -55,31 +55,38 @@ SCT capabilities: (0x70b5) SCT Status supported.
 SMART Attributes Data Structure revision number: 16
 Vendor Specific SMART Attributes with Thresholds:
 ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
-  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
-  3 Spin_Up_Time            0x0027   180   179   021    Pre-fail  Always       -       5991
-  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       114
+  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       4
+  3 Spin_Up_Time            0x0027   177   177   021    Pre-fail  Always       -       6116
+  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       116
   5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
   7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
-  9 Power_On_Hours          0x0032   092   092   000    Old_age   Always       -       6356
+  9 Power_On_Hours          0x0032   092   092   000    Old_age   Always       -       6372
  10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
  11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
- 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       57
+ 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       59
 192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       46
-193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       67
-194 Temperature_Celsius     0x0022   122   114   000    Old_age   Always       -       28
+193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       69
+194 Temperature_Celsius     0x0022   116   114   000    Old_age   Always       -       34
 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
 197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
 198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
 199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
-200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       1
+200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
 SMART Error Log Version: 1
 No Errors Logged
 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
-# 1  Short offline       Completed: read failure       50%      6354         4377408
-# 2  Extended offline    Completed: read failure       90%      6354         4377408
+# 1  Extended offline    Completed without error       00%      6367         -
+# 2  Short offline       Completed: read failure       60%      6361         4377409
+# 3  Short offline       Completed: read failure       50%      6361         4377409
+# 4  Extended offline    Completed: read failure       90%      6359         4377409
+# 5  Short offline       Completed without error       00%      6359         -
+# 6  Short offline       Completed: read failure       60%      6356         4377409
+# 7  Short offline       Completed: read failure       50%      6354         4377408
+# 8  Extended offline    Completed: read failure       90%      6354         4377408
+6 of 6 failed self-tests are outdated by newer successful extended offline self-test # 1
 SMART Selective self-test log data structure revision number 1
  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
And the current output of smartctl -a:
smartctl 6.6 2018-12-05 r4851 [x86_64-linux-4.14.98] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital AV-GP (AF)
Device Model:     WDC WD20EURS-63SPKY0
Serial Number:    WD-WMC1T2763021
LU WWN Device Id: 5 0014ee 6addb4b7c
Firmware Version: 80.00A80
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Mon Feb 25 18:49:12 2019 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (27240) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 275) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x70b5) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       4
  3 Spin_Up_Time            0x0027   177   177   021    Pre-fail  Always       -       6116
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       116
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   092   092   000    Old_age   Always       -       6373
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       59
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       46
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       69
194 Temperature_Celsius     0x0022   116   114   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      6367         -
# 2  Short offline       Completed: read failure       60%      6361         4377409
# 3  Short offline       Completed: read failure       50%      6361         4377409
# 4  Extended offline    Completed: read failure       90%      6359         4377409
# 5  Short offline       Completed without error       00%      6359         -
# 6  Short offline       Completed: read failure       60%      6356         4377409
# 7  Short offline       Completed: read failure       50%      6354         4377408
# 8  Extended offline    Completed: read failure       90%      6354         4377408
6 of 6 failed self-tests are outdated by newer successful extended offline self-test # 1

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
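No, you don't want it marked as a bad sector. What you want is a write operation to the unreadable sector :)
As I quoted yesterday in "smartctl reports overall health test as passed but the tests failed?":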
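If the disk can read the sector of data a single time, and the damage is permanent, not transient, then the disk firmware will mark the sector as "bad" and allocate a spare sector to replace it. But if the disk can't read the sector even once, then it won't reallocate the sector, in hopes of being able, at some time in the future, to read the data from it. **A write to an unreadable (corrupted) sector will fix the problem. If the damage is transient, then new consistent data will be written to the sector.** If the damage is permanent, then the write will force sector reallocation.
(Bold emphasis partly mine; original source: smartmontools FAQ)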
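To make "a write to an unreadable sector" concrete for this drive: the self-test log reports 512-byte logical LBAs (4377408 and 4377409), while the drive has 4096-byte physical sectors, so it is the enclosing 4 KiB sector that would get rewritten. Below is a minimal Python sketch of that arithmetic. The function names and the /dev/sdX placeholder are my own, it assumes the physical sectors are aligned to LBA 0, and it only builds a dd command string rather than running it, because the write destroys whatever is stored there.

```python
LOGICAL = 512    # bytes per logical sector, from the smartctl output
PHYSICAL = 4096  # bytes per physical sector, from the smartctl output


def physical_sector(lba, logical=LOGICAL, physical=PHYSICAL):
    """Return (index, first_lba, last_lba) of the physical sector holding lba.

    Assumes physical sectors are aligned to LBA 0 (no alignment offset).
    """
    per_phys = physical // logical      # 8 logical sectors per physical one
    index = lba // per_phys
    first = index * per_phys
    return index, first, first + per_phys - 1


def dd_command(lba, device="/dev/sdX"):
    """Build (but do not run) a dd command that zeroes that physical sector."""
    index, _, _ = physical_sector(lba)
    # seek= is counted in blocks of bs bytes, hence the physical-sector index.
    return (f"dd if=/dev/zero of={device} bs={PHYSICAL} "
            f"seek={index} count=1 oflag=direct")


if __name__ == "__main__":
    for lba in (4377408, 4377409):
        print(lba, physical_sector(lba))
    print(dd_command(4377408))
```

Under that alignment assumption, both LBAs from the log fall into the same physical sector (index 547176, covering LBAs 4377408 to 4377415), so the two different "first error" addresses probably point at one and the same 4 KiB sector.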
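There were no reallocated sectors yesterday, and there are no reallocated sectors today.
That means the disk is "as healthy as it was" as far as bad sectors are concerned, if we ignore the fact that Raw_Read_Error_Rate went up to 4. Was that caused by the offline tests? But you did fix the unreadable sector, as tests 1 and 5 show. That is good. It is odd, though, that tests 2-4 failed as well.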
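The pass/fail pattern in the self-test log can also be pulled out mechanically. The following is a rough Python sketch that parses the log rows as posted in the question; the regular expression and the function name are my own guess at the smartctl text layout, not an official parser.

```python
import re

# One self-test log row, e.g.
# "# 2  Short offline  Completed: read failure  60%  6361  4377409"
ROW = re.compile(
    r"^#\s*(?P<num>\d+)\s+(?P<test>\w+\s+\w+)\s+(?P<status>.+?)\s+"
    r"(?P<remaining>\d+)%\s+(?P<hours>\d+)\s+(?P<lba>\d+|-)\s*$"
)


def parse_selftest_log(text):
    """Return one dict per self-test row, in the order they appear."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            rows.append({
                "num": int(m.group("num")),
                "test": " ".join(m.group("test").split()),
                "status": m.group("status"),
                "hours": int(m.group("hours")),
                # "-" means no failing LBA was recorded for this test
                "lba": None if m.group("lba") == "-" else int(m.group("lba")),
            })
    return rows


# Log rows copied from the question.
SAMPLE = """\
# 1  Extended offline    Completed without error       00%      6367         -
# 2  Short offline       Completed: read failure       60%      6361         4377409
# 3  Short offline       Completed: read failure       50%      6361         4377409
# 4  Extended offline    Completed: read failure       90%      6359         4377409
# 5  Short offline       Completed without error       00%      6359         -
# 6  Short offline       Completed: read failure       60%      6356         4377409
# 7  Short offline       Completed: read failure       50%      6354         4377408
# 8  Extended offline    Completed: read failure       90%      6354         4377408
"""

if __name__ == "__main__":
    rows = parse_selftest_log(SAMPLE)
    passed = [r["num"] for r in rows if r["lba"] is None]
    failing_lbas = sorted({r["lba"] for r in rows if r["lba"] is not None})
    print("passed tests:", passed)
    print("failing LBAs:", failing_lbas)
```

On the posted log this lists tests 1 and 5 as passing and only two distinct failing LBAs, 4377408 and 4377409.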
Hm, maybe I would run the tests a few more times and see what happens. And keep an eye on Raw_Read_Error_Rate whenever you run a test or write zeros with dd.
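One way to keep an eye on Raw_Read_Error_Rate is to save the attribute table before and after each test or dd run and compare the raw values. Here is a small Python sketch of that idea; the helper names are hypothetical, the parsing assumes the ten-column attribute layout shown in the question, and the sample rows are taken from the before/after diff above.

```python
def parse_attributes(text):
    """Map attribute name -> raw value for rows of a smartctl attribute table."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split()
        # A data row has at least 10 columns and starts with the numeric ID.
        if len(parts) >= 10 and parts[0].isdigit():
            raw = parts[9]
            attrs[parts[1]] = int(raw) if raw.isdigit() else raw
    return attrs


def changed(before, after):
    """Return {name: (old, new)} for attributes whose raw value differs."""
    return {name: (before[name], after[name])
            for name in before
            if name in after and before[name] != after[name]}


# A few rows from the question's "before" and "after" snapshots.
BEFORE = """\
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       1
"""
AFTER = """\
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       4
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
"""

if __name__ == "__main__":
    diff = changed(parse_attributes(BEFORE), parse_attributes(AFTER))
    for name, (old, new) in diff.items():
        print(f"{name}: {old} -> {new}")
```

In practice the two snapshots would come from saved output of smartctl -A on the drive, taken before and after each test or write.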