Linux

RHEL: how to check the dmesg logs to find the root cause of an unexpected reboot

  • July 7, 2021

We have a RHEL 7.6 server (a VM server).

It has rebooted unexpectedly twice (the reboots show up in the output of the "last" command).
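A minimal sketch of how the "last" output could be screened for reboots that were not preceded by a recorded shutdown. It assumes that `last -x` is available and prints `reboot` and `shutdown` pseudo-user records, newest first. The function names are hypothetical, and the "no shutdown record before a boot" rule is only a heuristic, not something stated in this post.

```python
import subprocess


def read_last_x():
    """Return the text printed by `last -x` (includes shutdown records)."""
    result = subprocess.run(
        ["last", "-x"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout


def find_unclean_boots(last_output):
    """Return 'reboot' lines that are not preceded by a 'shutdown' record.

    Assumption: `last` lists the newest entries first, so the boot/shutdown
    record that follows a 'reboot' line describes what happened just before
    that boot.
    """
    # Keep only the records that mark boots and shutdowns.
    events = []
    for line in last_output.splitlines():
        fields = line.split()
        if fields and fields[0] in ("reboot", "shutdown"):
            events.append((fields[0], line))

    unclean = []
    # Pair every record with the next older one.
    for (kind, line), (older_kind, _) in zip(events, events[1:]):
        if kind == "reboot" and older_kind != "shutdown":
            unclean.append(line)
    return unclean
```

Calling `find_unclean_boots(read_last_x())` on the affected server would list the suspicious boots. The oldest boot in the wtmp file cannot be classified this way, because the record before it is missing.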

After reviewing the dmesg output, we see the following messages:

Jul  3 09:56:42 server_MA02 kernel: ata12: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata5: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata11: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata4: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata10: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata14: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata13: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata15: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata16: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata20: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata21: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata26: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata19: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata29: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata32: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata28: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata31: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata25: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata30: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata22: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata18: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata17: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata23: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata24: SATA link down (SStatus 0 SControl 300)
Jul  3 09:56:42 server_MA02 kernel: ata27: SATA link down (SStatus 0 SControl 300)

Could the messages above be related to the unexpected reboots of the VM?

When you say "VM server", do you mean that the server is a physical machine hosting VMs, or that the server is itself a virtual machine?

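If the logs come from a physical machine, it looks like a large number of SATA links went down at the same moment; could it be a power problem, or a problem with the SATA controller?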

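If the server is a virtual machine, this may mean that the host suddenly stopped providing virtual disks to the VM. You should check the physical host's logs (or contact its administrator, if it is hosted by someone else) to see whether the host has some kind of hardware problem.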
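Whichever case applies, it can help to confirm how many ports reported "SATA link down" and whether they all did so at the same second, as in the excerpt from the question. The following is a minimal sketch that groups those messages by timestamp. It assumes syslog-style lines like the ones quoted above, and that the log lives in /var/log/messages (the usual RHEL location, not confirmed in this post). The function names are hypothetical.

```python
import re
from collections import defaultdict

# Matches syslog-style kernel lines such as:
#   Jul  3 09:56:42 server_MA02 kernel: ata12: SATA link down (SStatus 0 SControl 300)
LINK_DOWN = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"ata(?P<port>\d+): SATA link down"
)


def summarize_link_down(lines):
    """Map each timestamp to the sorted ATA port numbers reported down."""
    by_stamp = defaultdict(set)
    for line in lines:
        match = LINK_DOWN.match(line.strip())
        if match:
            by_stamp[match.group("stamp")].add(int(match.group("port")))
    return {stamp: sorted(ports) for stamp, ports in by_stamp.items()}


def summarize_file(path="/var/log/messages"):
    """Summarize a log file; the default path is an assumption."""
    with open(path, errors="replace") as handle:
        return summarize_link_down(handle)
```

A single timestamp covering many ports matches the "everything at once" pattern described above. Comparing that timestamp with the boot times reported by "last" shows whether the burst lines up with one of the unexpected reboots.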

Quoted from: https://unix.stackexchange.com/questions/657335