Zfs

ZFS on Linux - unexpected behavior after device failure

  • October 2, 2021

I maintain a Debian server with a ZFS storage pool (RAID Z3). Recently ZFS reported that two disks had failed at the same time:

ZFS has detected that a device was removed.

impact: Fault tolerance of the pool may be compromised.
   eid: 138
 class: statechange
 state: REMOVED
  host: serres-west-wing
  time: 2021-04-30 01:30:15+0300
 vpath: /dev/disk/by-vdev/d0-part1
 vguid: 0x6622AF6B1929E199
  pool: 0x0964CF6A3748D7A9
ZFS has detected that a device was removed.

impact: Fault tolerance of the pool may be compromised.
   eid: 140
 class: statechange
 state: REMOVED
  host: serres-west-wing
  time: 2021-04-30 01:30:15+0300
 vpath: /dev/disk/by-vdev/d1-part1
 vguid: 0xD48BA6B066788199
  pool: 0x0964CF6A3748D7A9

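After these messages were generated, the hot spare was activated and resilvering started immediately. Once the resilver finished, the pool status was as follows: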

ZFS has finished a resilver:

  eid: 167
class: resilver_finish
 host: serres-west-wing
 time: 2021-04-30 02:15:03+0300
 pool: datapool
state: ONLINE
 scan: resilvered 132G in 00:44:41 with 0 errors on Fri Apr 30 02:15:03 2021
config:

       NAME               STATE     READ WRITE CKSUM
       datapool           ONLINE       0     0     0
         raidz2-0         ONLINE       0     0     0
           spare-0        ONLINE       0     0     0
             d0-part1     ONLINE       0     0     0
             hs-d0-part1  ONLINE       0     0     0
           d1-part1       ONLINE       0     0     0
           d2-part1       ONLINE       0     0     0
           d3-part1       ONLINE       0     0     0
           d4-part1       ONLINE       0     0     0
       logs
         mirror-1         ONLINE       0     0     0
           zil-d0-part1   ONLINE       0     0     0
           zil-d1-part1   ONLINE       0     0     0
       cache
         l2arc-d0-part2   ONLINE       0     0     0
         l2arc-d1-part2   ONLINE       0     0     0
       spares
         hs-d0-part1      INUSE     currently in use

errors: No known data errors

The disks d0-part1 and d1-part1 appear to be connected and working properly.

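Is this an error caused by something unrelated to disk degradation? It seems unlikely that two working disks would fail at the same moment. Is it safe to detach the hot spare?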

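It appears the disk disconnections were caused by a power problem. After upgrading the machine's UPS, I have not had any further issues. I detached the hot spare with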

zpool detach datapool hs-d0-part1

and then scrubbed the pool

zpool scrub datapool

to bring the pool back to its original state.
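Before detaching, it can help to confirm which spare is actually in use. The following is only a minimal sketch: `inuse_spares` is a hypothetical helper name, not part of ZFS, and it assumes the `zpool status` layout shown in the output above (a `spares` section whose entries carry the state `INUSE`, ended by a blank line).

```shell
#!/bin/sh
# Hypothetical helper: print the names of hot spares that `zpool status`
# reports as INUSE. It reads the status text on stdin, so it can also be
# tried on saved output without touching the pool.
inuse_spares() {
    awk '
        $1 == "spares" { in_spares = 1; next }   # start of the spares section
        NF == 0        { in_spares = 0 }         # a blank line ends the section
        in_spares && $2 == "INUSE" { print $1 }  # spare currently standing in
    '
}

# Usage (as root, once the original disks are confirmed healthy):
#   zpool status datapool | inuse_spares
#   zpool detach datapool hs-d0-part1
#   zpool scrub datapool
#   zpool status datapool    # the "scan:" line reports scrub progress
```

Fed the status output from the question, the helper prints `hs-d0-part1`, which is the device name passed to `zpool detach`.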

Source: https://unix.stackexchange.com/questions/647575