Hard-Disk

According to SMART the hard disk is not failing, but I have errors in dmesg

  • June 10, 2018

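Sometimes I run into strange problems when booting my computer (running Debian), so I ran the "dmesg" command. Its output contained a lot of errors. However, when I ran an extended SMART self-test on the hard disk (with the "smartctl -t long /dev/sda" command), the result was that my disk is not damaged.

What is causing these errors?

Here are the errors: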

  (...)
     [  505.918537] ata3.00: exception Emask 0x50 SAct 0x400 SErr 0x280900 action 0x6 frozen
     [  505.918549] ata3.00: irq_stat 0x08000000, interface fatal error
     [  505.918558] ata3: SError: { UnrecovData HostInt 10B8B BadCRC }
     [  505.918566] ata3.00: failed command: READ FPDMA QUEUED
     [  505.918579] ata3.00: cmd 60/40:50:20:5b:60/00:00:0b:00:00/40 tag 10 ncq 32768 in
              res 40/00:54:20:5b:60/00:00:0b:00:00/40 Emask 0x50 (ATA bus error)
     [  505.918586] ata3.00: status: { DRDY }
     [  505.918595] ata3: hard resetting link
     [  506.410055] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
     [  506.422648] ata3.00: configured for UDMA/133
     [  506.422679] ata3: EH complete
     [ 1633.123880] md: bind<sdb3>
     [ 1633.187966] RAID1 conf printout:
     [ 1633.187977]  --- wd:1 rd:2
     [ 1633.187984]  disk 0, wo:0, o:1, dev:sda3
     [ 1633.187989]  disk 1, wo:1, o:1, dev:sdb3
     [ 1633.188866] md: recovery of RAID array md0
     [ 1633.188871] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
     [ 1633.188875] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
     [ 1633.188890] md: using 128k window, over a total of 1943618560k.
     [ 1634.167341] ata3.00: exception Emask 0x50 SAct 0x7f80 SErr 0x280900 action 0x6 frozen
     [ 1634.167353] ata3.00: irq_stat 0x08000000, interface fatal error
     [ 1634.167361] ata3: SError: { UnrecovData HostInt 10B8B BadCRC }
     [ 1634.167369] ata3.00: failed command: READ FPDMA QUEUED
     [ 1634.167382] ata3.00: cmd 60/00:38:00:00:6f/02:00:01:00:00/40 tag 7 ncq 262144 in
              res 40/00:6c:00:0c:6f/00:00:01:00:00/40 Emask 0x50 (ATA bus error)
     [ 1634.167389] ata3.00: status: { DRDY }
     [ 1634.167395] ata3.00: failed command: READ FPDMA QUEUED
     [ 1634.167407] ata3.00: cmd 60/00:40:00:02:6f/02:00:01:00:00/40 tag 8 ncq 262144 in
              res 40/00:6c:00:0c:6f/00:00:01:00:00/40 Emask 0x50 (ATA bus error)
     [ 1634.167413] ata3.00: status: { DRDY }
     [ 1634.167418] ata3.00: failed command: READ FPDMA QUEUED
     [ 1634.167429] ata3.00: cmd 60/00:48:00:04:6f/02:00:01:00:00/40 tag 9 ncq 262144 in
              res 40/00:6c:00:0c:6f/00:00:01:00:00/40 Emask 0x50 (ATA bus error)
     [ 1634.167435] ata3.00: status: { DRDY }
     [ 1634.167439] ata3.00: failed command: READ FPDMA QUEUED
     [ 1634.167451] ata3.00: cmd 60/00:50:00:06:6f/02:00:01:00:00/40 tag 10 ncq 262144 in
              res 40/00:6c:00:0c:6f/00:00:01:00:00/40 Emask 0x50 (ATA bus error)
     [ 1634.167457] ata3.00: status: { DRDY }
     [ 1634.167462] ata3.00: failed command: READ FPDMA QUEUED
     [ 1634.167473] ata3.00: cmd 60/00:58:00:08:6f/02:00:01:00:00/40 tag 11 ncq 262144 in
              res 40/00:6c:00:0c:6f/00:00:01:00:00/40 Emask 0x50 (ATA bus error)
     [ 1634.167479] ata3.00: status: { DRDY }
     [ 1634.167484] ata3.00: failed command: READ FPDMA QUEUED
     [ 1634.167495] ata3.00: cmd 60/00:60:00:0a:6f/02:00:01:00:00/40 tag 12 ncq 262144 in
              res 40/00:6c:00:0c:6f/00:00:01:00:00/40 Emask 0x50 (ATA bus error)
     [ 1634.167500] ata3.00: status: { DRDY }
     [ 1634.167505] ata3.00: failed command: READ FPDMA QUEUED
     [ 1634.167516] ata3.00: cmd 60/80:68:00:0c:6f/00:00:01:00:00/40 tag 13 ncq 65536 in
              res 40/00:6c:00:0c:6f/00:00:01:00:00/40 Emask 0x50 (ATA bus error)
     [ 1634.167522] ata3.00: status: { DRDY }
     [ 1634.167527] ata3.00: failed command: READ FPDMA QUEUED
     [ 1634.167538] ata3.00: cmd 60/00:70:80:0c:6f/02:00:01:00:00/40 tag 14 ncq 262144 in
              res 40/00:6c:00:0c:6f/00:00:01:00:00/40 Emask 0x50 (ATA bus error)
     [ 1634.167544] ata3.00: status: { DRDY }
     [ 1634.167553] ata3: hard resetting link
     [ 1634.658816] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
     [ 1634.672645] ata3.00: configured for UDMA/133
     [ 1634.672696] ata3: EH complete
     [ 1637.687898] ata3.00: exception Emask 0x50 SAct 0x3ff000 SErr 0x280900 action 0x6 frozen
     [ 1637.687910] ata3.00: irq_stat 0x08000000, interface fatal error
     [ 1637.687918] ata3: SError: { UnrecovData HostInt 10B8B BadCRC }
     [ 1637.687926] ata3.00: failed command: READ FPDMA QUEUED
     [ 1637.687940] ata3.00: cmd 60/00:60:80:a7:af/02:00:02:00:00/40 tag 12 ncq 262144 in
              res 40/00:ac:00:b4:af/00:00:02:00:00/40 Emask 0x50 (ATA bus error)
     [ 1637.687947] ata3.00: status: { DRDY }
     [ 1637.687953] ata3.00: failed command: READ FPDMA QUEUED
     [ 1637.687965] ata3.00: cmd 60/00:68:80:a9:af/02:00:02:00:00/40 tag 13 ncq 262144 in
              res 40/00:ac:00:b4:af/00:00:02:00:00/40 Emask 0x50 (ATA bus error)
     [ 1637.687971] ata3.00: status: { DRDY }
     [ 1637.687976] ata3.00: failed command: READ FPDMA QUEUED
     [ 1637.687987] ata3.00: cmd 60/80:70:80:ab:af/01:00:02:00:00/40 tag 14 ncq 196608 in
              res 40/00:ac:00:b4:af/00:00:02:00:00/40 Emask 0x50 (ATA bus error)
     [ 1637.687993] ata3.00: status: { DRDY }
     [ 1637.687998] ata3.00: failed command: READ FPDMA QUEUED
     [ 1637.688009] ata3.00: cmd 60/00:78:00:ad:af/02:00:02:00:00/40 tag 15 ncq 262144 in
              res 40/00:ac:00:b4:af/00:00:02:00:00/40 Emask 0x50 (ATA bus error)
     [ 1637.688015] ata3.00: status: { DRDY }
     [ 1637.688020] ata3.00: failed command: READ FPDMA QUEUED
     [ 1637.688031] ata3.00: cmd 60/80:80:00:af:af/00:00:02:00:00/40 tag 16 ncq 65536 in
              res 40/00:ac:00:b4:af/00:00:02:00:00/40 Emask 0x50 (ATA bus error)
     [ 1637.688037] ata3.00: status: { DRDY }
     [ 1637.688042] ata3.00: failed command: READ FPDMA QUEUED
     [ 1637.688053] ata3.00: cmd 60/00:88:80:af:af/01:00:02:00:00/40 tag 17 ncq 131072 in
              res 40/00:ac:00:b4:af/00:00:02:00:00/40 Emask 0x50 (ATA bus error)
     [ 1637.688059] ata3.00: status: { DRDY }
     [ 1637.688064] ata3.00: failed command: READ FPDMA QUEUED
     [ 1637.688075] ata3.00: cmd 60/80:90:80:b0:af/00:00:02:00:00/40 tag 18 ncq 65536 in
              res 40/00:ac:00:b4:af/00:00:02:00:00/40 Emask 0x50 (ATA bus error)
     [ 1637.688081] ata3.00: status: { DRDY }
     [ 1637.688085] ata3.00: failed command: READ FPDMA QUEUED
     [ 1637.688096] ata3.00: cmd 60/00:98:00:b1:af/02:00:02:00:00/40 tag 19 ncq 262144 in
              res 40/00:ac:00:b4:af/00:00:02:00:00/40 Emask 0x50 (ATA bus error)
     [ 1637.688102] ata3.00: status: { DRDY }
     [ 1637.688107] ata3.00: failed command: READ FPDMA QUEUED
     [ 1637.688118] ata3.00: cmd 60/00:a0:00:b3:af/01:00:02:00:00/40 tag 20 ncq 131072 in
              res 40/00:ac:00:b4:af/00:00:02:00:00/40 Emask 0x50 (ATA bus error)
     [ 1637.688124] ata3.00: status: { DRDY }
     [ 1637.688129] ata3.00: failed command: READ FPDMA QUEUED
     [ 1637.688140] ata3.00: cmd 60/00:a8:00:b4:af/01:00:02:00:00/40 tag 21 ncq 131072 in
              res 40/00:ac:00:b4:af/00:00:02:00:00/40 Emask 0x50 (ATA bus error)
     [ 1637.688146] ata3.00: status: { DRDY }
     [ 1637.688154] ata3: hard resetting link
     [ 1638.179398] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
     [ 1638.192977] ata3.00: configured for UDMA/133
     [ 1638.193029] ata3: EH complete
     [ 1640.259492] md: export_rdev(sdb1)
     [ 1640.326109] md: bind<sdb1>
     [ 1640.346712] RAID1 conf printout:
     [ 1640.346724]  --- wd:1 rd:2
     [ 1640.346731]  disk 0, wo:0, o:1, dev:sda1
     [ 1640.346736]  disk 1, wo:1, o:1, dev:sdb1
     [ 1640.346893] md: delaying recovery of md1 until md0 has finished (they share one or more physical units)
     [ 1657.987964] ata3.00: exception Emask 0x50 SAct 0x40000 SErr 0x280900 action 0x6 frozen
     [ 1657.987975] ata3.00: irq_stat 0x08000000, interface fatal error
     [ 1657.987984] ata3: SError: { UnrecovData HostInt 10B8B BadCRC }
     [ 1657.987992] ata3.00: failed command: READ FPDMA QUEUED
     [ 1657.988006] ata3.00: cmd 60/00:90:00:30:2e/03:00:09:00:00/40 tag 18 ncq 393216 in
              res 40/00:94:00:30:2e/00:00:09:00:00/40 Emask 0x50 (ATA bus error)
     [ 1657.988013] ata3.00: status: { DRDY }
     [ 1657.988022] ata3: hard resetting link
     [ 1658.479548] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
     [ 1658.493107] ata3.00: configured for UDMA/133
     [ 1658.493147] ata3: EH complete
     [ 1670.547791] ata3: limiting SATA link speed to 1.5 Gbps
     [ 1670.547805] ata3.00: exception Emask 0x50 SAct 0x7f SErr 0x280900 action 0x6 frozen
     [ 1670.547812] ata3.00: irq_stat 0x08000000, interface fatal error
     [ 1670.547820] ata3: SError: { UnrecovData HostInt 10B8B BadCRC }
     [ 1670.547826] ata3.00: failed command: READ FPDMA QUEUED
     [ 1670.547839] ata3.00: cmd 60/80:00:00:1f:2e/01:00:0c:00:00/40 tag 0 ncq 196608 in
              res 40/00:2c:00:26:2e/00:00:0c:00:00/40 Emask 0x50 (ATA bus error)
     [ 1670.547846] ata3.00: status: { DRDY }
     [ 1670.547852] ata3.00: failed command: READ FPDMA QUEUED
     [ 1670.547863] ata3.00: cmd 60/80:08:80:20:2e/00:00:0c:00:00/40 tag 1 ncq 65536 in
              res 40/00:2c:00:26:2e/00:00:0c:00:00/40 Emask 0x50 (ATA bus error)
     [ 1670.547869] ata3.00: status: { DRDY }
     [ 1670.547875] ata3.00: failed command: READ FPDMA QUEUED
     [ 1670.547886] ata3.00: cmd 60/00:10:00:21:2e/02:00:0c:00:00/40 tag 2 ncq 262144 in
              res 40/00:2c:00:26:2e/00:00:0c:00:00/40 Emask 0x50 (ATA bus error)
     [ 1670.547892] ata3.00: status: { DRDY }
     [ 1670.547896] ata3.00: failed command: READ FPDMA QUEUED
     [ 1670.547907] ata3.00: cmd 60/00:18:00:23:2e/02:00:0c:00:00/40 tag 3 ncq 262144 in
              res 40/00:2c:00:26:2e/00:00:0c:00:00/40 Emask 0x50 (ATA bus error)
     [ 1670.547913] ata3.00: status: { DRDY }
     [ 1670.547918] ata3.00: failed command: READ FPDMA QUEUED
     [ 1670.547929] ata3.00: cmd 60/00:20:00:25:2e/01:00:0c:00:00/40 tag 4 ncq 131072 in
              res 40/00:2c:00:26:2e/00:00:0c:00:00/40 Emask 0x50 (ATA bus error)
     [ 1670.547935] ata3.00: status: { DRDY }
     [ 1670.547940] ata3.00: failed command: READ FPDMA QUEUED
     [ 1670.547951] ata3.00: cmd 60/00:28:00:26:2e/02:00:0c:00:00/40 tag 5 ncq 262144 in
              res 40/00:2c:00:26:2e/00:00:0c:00:00/40 Emask 0x50 (ATA bus error)
     [ 1670.547957] ata3.00: status: { DRDY }
     [ 1670.547961] ata3.00: failed command: READ FPDMA QUEUED
     [ 1670.547972] ata3.00: cmd 60/00:30:00:28:2e/02:00:0c:00:00/40 tag 6 ncq 262144 in
              res 40/00:2c:00:26:2e/00:00:0c:00:00/40 Emask 0x50 (ATA bus error)
     [ 1670.547978] ata3.00: status: { DRDY }
     [ 1670.547987] ata3: hard resetting link
     [ 1671.039264] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
     [ 1671.053386] ata3.00: configured for UDMA/133
     [ 1671.053444] ata3: EH complete
     [ 2422.512002] md: md0: recovery done.
     [ 2422.547344] md: recovery of RAID array md1
     [ 2422.547355] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
     [ 2422.547360] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
     [ 2422.547378] md: using 128k window, over a total of 4877312k.
     [ 2422.668465] RAID1 conf printout:
     [ 2422.668474]  --- wd:2 rd:2
     [ 2422.668480]  disk 0, wo:0, o:1, dev:sda3
     [ 2422.668486]  disk 1, wo:0, o:1, dev:sdb3
     [ 2469.990451] md: md1: recovery done.
     [ 2470.049986] RAID1 conf printout:
     [ 2470.049997]  --- wd:2 rd:2
     [ 2470.050003]  disk 0, wo:0, o:1, dev:sda1
     [ 2470.050009]  disk 1, wo:0, o:1, dev:sdb1
     [ 3304.445149] PM: Hibernation mode set to 'platform'
     [ 3304.782375] PM: Syncing filesystems ... done.
     [ 3307.028591] Freezing user space processes ... (elapsed 0.001 seconds) done.
     (...)

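First, keep in mind that **SMART saying your drive is healthy does not necessarily mean the drive is healthy.** SMART reports are an aid, not an absolute truth.

If all you are interested in is what to do, not why, feel free to scroll down to the last few paragraphs. The text in between explains why I think what I am proposing is the right course of action, and how that can be derived from what you posted.

That said, let's look at what one of these errors is telling us.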

     [ 1670.547805] ata3.00: exception Emask 0x50 SAct 0x7f SErr 0x280900 action 0x6 frozen
     [ 1670.547812] ata3.00: irq_stat 0x08000000, interface fatal error
     [ 1670.547820] ata3: SError: { UnrecovData HostInt 10B8B BadCRC }
     [ 1670.547826] ata3.00: failed command: READ FPDMA QUEUED
     [ 1670.547839] ata3.00: cmd 60/80:00:00:1f:2e/01:00:0c:00:00/40 tag 0 ncq 196608 in
              res 40/00:2c:00:26:2e/00:00:0c:00:00/40 Emask 0x50 (ATA bus error)
     [ 1670.547846] ata3.00: status: { DRDY }
     [ 1670.547852] ata3.00: failed command: READ FPDMA QUEUED

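(I hope I grabbed the parts that belong together, but you have plenty of them, so it should be fine either way.)

The Linux ata wiki has a page explaining how to read these errors. In particular:

  • A status of DRDY means "Device ready. Normally 1, when all is OK." Seeing a status value of DRDY is perfectly normal and expected.

  • SError has several component values, of which you are seeing (in this particular snippet):

    • UnrecovData: "Data integrity error occurred, interface did not recover"
    • HostInt: "Host bus adapter internal error"
    • 10B8B: "10b to 8b decoding error occurred"
    • BadCRC: "Link layer CRC error occurred"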
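The flag names in the SError line are just a decoded view of the SErr value on the exception line (0x280900 here). As a minimal sketch, assuming the bit positions below match the SATA SError register layout as the kernel's libata defines it (I am writing the table from memory, so check it against the kernel source before relying on it), the decoding can be reproduced like this:

```python
# Assumed bit positions of the SATA SError register, with the names
# that appear in the kernel log. Verify against the kernel source.
SERR_FLAGS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value):
    """Return the names of the SError flags set in value, lowest bit first."""
    return [name for bit, name in sorted(SERR_FLAGS.items()) if value & (1 << bit)]


if __name__ == "__main__":
    # SErr value taken from the exception line in the question.
    print(decode_serror(0x280900))
```

With the value from your log this yields the same four flags the kernel printed, which is a useful sanity check that the table and the log agree.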

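8b/10b encoding, which is what the 10B8B flag refers to, encodes 8 bits as 10 bits to aid in both signal synchronization and error detection. It is used on the physical wiring, not necessarily on the drive itself. The drive most likely uses some other form of FEC or ECC coding, and an error there would normally show up as some form of I/O error, likely with an error value of UNC ("uncorrectable error - often due to bad sectors on the disk") and probably with "media error" ("software detected a media error") in parentheses at the end of the res line. That is not what you are seeing, so while we cannot rule it out entirely, it seems unlikely.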
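To check this on a whole log rather than by eye, the first two bytes of each res line can be decoded: the value before the slash is the ATA status register and the value after it is the error register. The following is a rough sketch, assuming the usual ATA bit assignments (DRDY is bit 6 of status, UNC is bit 6 of error); the function and table names are my own, not anything from libata:

```python
import re

# Assumed ATA status and error register bits (subset, by bit number).
STATUS_BITS = {7: "BUSY", 6: "DRDY", 5: "DF", 3: "DRQ", 0: "ERR"}
ERROR_BITS = {7: "ICRC", 6: "UNC", 4: "IDNF", 2: "ABRT", 0: "AMNF"}

# Matches e.g. "res 40/00:2c:...:00/40 Emask 0x50 (ATA bus error)".
RES_RE = re.compile(
    r"res (?P<status>[0-9a-f]{2})/(?P<error>[0-9a-f]{2}):.*?"
    r"Emask (?P<emask>0x[0-9a-f]+) \((?P<desc>[^)]*)\)"
)


def bit_names(value, table):
    """Return the names of the set bits, highest bit first."""
    return [name for bit, name in sorted(table.items(), reverse=True)
            if value & (1 << bit)]


def decode_res(line):
    """Decode status, error and description from a libata 'res' line."""
    m = RES_RE.search(line)
    if m is None:
        return None
    return {
        "status": bit_names(int(m["status"], 16), STATUS_BITS),
        "error": bit_names(int(m["error"], 16), ERROR_BITS),
        "emask": int(m["emask"], 16),
        "description": m["desc"],
    }


if __name__ == "__main__":
    line = "res 40/00:2c:00:26:2e/00:00:0c:00:00/40 Emask 0x50 (ATA bus error)"
    print(decode_res(line))
```

For the res lines in your log the status decodes to DRDY only and the error register is empty, so there is no UNC bit and no "media error" description, which matches the reading above.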

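The "link layer" is the physical cable and circuit board traces between the drive's own controller and the disk drive interface chip (likely part of the southbridge on your computer's motherboard, but possibly located on an offboard HBA).

A host bus adapter, also known as an HBA, is the circuitry that connects to the storage device. It is also colloquially called a "disk controller", a term that is something of a misnomer on modern systems. The most visible part of the HBA is generally the connection ports, these days most commonly SATA or some SAS form factor.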

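The UnrecovData and HostInt flags basically tell us "something went horribly wrong, and there was either no way to recover or no attempt at recovery was made". The opposite would likely be RecovData, which indicates that a "data integrity error occurred, but the interface recovered". (As an aside, I would probably have used HBAInt instead of HostInt, because the "host" refers to the HBA, not the whole system.)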

The combination of 10B8B and BadCRC, both of which point at the physical link layer, makes me suspect a cabling issue.
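If you want to see whether the same pattern holds across the whole log, and whether it is confined to one port, a small tally helps. This is only a sketch: the function names are hypothetical, and the regular expressions assume the line formats shown in your dmesg output.

```python
import re
from collections import Counter

# Line formats assumed from the dmesg output in the question.
SERROR_RE = re.compile(r"(ata\d+): SError: \{ ([^}]*)\}")
SPEED_RE = re.compile(r"(ata\d+): limiting SATA link speed to ([\d.]+) Gbps")


def tally_serror(log_text):
    """Count how often each SError flag was reported, per ATA port."""
    counts = {}
    for port, flags in SERROR_RE.findall(log_text):
        counts.setdefault(port, Counter()).update(flags.split())
    return counts


def link_speed_drops(log_text):
    """List (port, new speed in Gbps) for every link speed reduction."""
    return [(port, float(speed)) for port, speed in SPEED_RE.findall(log_text)]


if __name__ == "__main__":
    # A few lines copied from the log in the question.
    sample = """\
[  505.918558] ata3: SError: { UnrecovData HostInt 10B8B BadCRC }
[ 1634.167361] ata3: SError: { UnrecovData HostInt 10B8B BadCRC }
[ 1670.547791] ata3: limiting SATA link speed to 1.5 Gbps
"""
    print(tally_serror(sample))
    print(link_speed_drops(sample))
```

To run it on a real log, pass the saved output of dmesg as a string to both functions. If every error sits on a single port and always carries 10B8B and BadCRC, that is consistent with a problem on that one cable or connector rather than with the drive's media.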

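This suspicion is also supported by the fact that the SMART self-tests, which apart from status reporting run entirely inside the drive, found no errors that the manufacturer considers serious enough to warrant reporting in the results. If the drive were having trouble storing or reading data, the long SMART self-test in particular should have reported it.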

TL;DR:

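So, the first thing I would do is simply **unplug and re-plug the SATA cable at both ends;** it may be slightly loose, causing it to intermittently lose electrical contact. See if that resolves the problem. It may even be worth doing this for all SATA cables in the computer, not just the one for the affected disk. If you are using an offboard HBA, I would also remove and reseat that card, mainly because it is an easy thing to try while you are already messing with the cabling.

Failing that, **try throwing away and replacing the SATA cable, preferably with a high-quality cable.** A high-quality cable will be slightly more expensive, but I find it is usually worth it if it helps avoid headaches like this. Nobody likes seeing errors on their storage!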

Source: https://unix.stackexchange.com/questions/304661