Linux
RHEL unexpected reboot + messages file
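We have a RHEL 7.5 machine that rebooted unexpectedly. In /var/log/messages we can see the following lines from just before the reboot. Any idea how these lines indicate a machine reboot?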
May 8 21:46:01 server_mng kernel: system 00:00: [io 0x1000-0x103f] could not be reserved
May 8 21:46:01 server_mng kernel: system 00:00: [io 0x1040-0x104f] has been reserved
May 8 21:46:01 server_mng kernel: system 00:00: [io 0x0cf0-0x0cf1] has been reserved
May 8 21:46:01 server_mng kernel: system 00:04: [mem 0xfed00000-0xfed003ff] has been reserved
May 8 21:46:01 server_mng kernel: system 00:05: [io 0xfce0-0xfcff] has been reserved
May 8 21:46:01 server_mng kernel: system 00:05: [mem 0xf0000000-0xf7ffffff] has been reserved
May 8 21:46:01 server_mng kernel: system 00:05: [mem 0xfe800000-0xfe9fffff] has been reserved
May 8 21:46:01 server_mng kernel: pnp: PnP ACPI: found 6 devices
May 8 21:46:01 server_mng kernel: ACPI: bus type PNP unregistered
May 8 21:46:01 server_mng kernel: pci 0000:00:15.0: BAR 15: assigned [mem 0xc0000000-0xc01fffff 64bit pref]
May 8 21:46:01 server_mng kernel: pci 0000:00:16.0: BAR 15: assigned [mem 0xc0200000-0xc03fffff 64bit pref]
May 8 21:46:01 server_mng kernel: pci 0000:00:0f.0: BAR 6: assigned [mem 0xc0400000-0xc0407fff pref]
May 8 21:46:01 server_mng kernel: pci 0000:00:15.3: BAR 13: no space for [io size 0x1000]
May 8 21:46:01 server_mng kernel: pci 0000:00:15.3: BAR 13: failed to assign [io size 0x1000]
May 8 21:46:01 server_mng kernel: pci 0000:00:15.4: BAR 13: no space for [io size 0x1000]
May 8 21:46:01 server_mng kernel: pci 0000:00:15.4: BAR 13: failed to assign [io size 0x1000]
May 8 21:46:01 server_mng kernel: pci 0000:00:01.0: PCI bridge to [bus 01]
May 8 21:46:01 server_mng kernel: pci 0000:02:01.0: BAR 6: assigned [mem 0xfd500000-0xfd50ffff pref]
May 8 21:46:01 server_mng kernel: pci 0000:00:11.0: PCI bridge to [bus 02]
May 8 21:46:01 server_mng kernel: pci 0000:00:11.0: bridge window [io 0x2000-0x3fff]
May 8 21:46:01 server_mng kernel: pci 0000:00:11.0: bridge window [mem 0xfd500000-0xfdffffff]
May 8 21:46:01 server_mng kernel: pci 0000:00:11.0: bridge window [mem 0xe7b00000-0xe7ffffff 64bit pref]
May 8 21:46:01 server_mng kernel: pci 0000:03:00.0: BAR 6: assigned [mem 0xfd400000-0xfd40ffff pref]
May 8 21:46:01 server_mng kernel: pci 0000:00:15.0: PCI bridge to [bus 03]
May 8 21:46:01 server_mng kernel: pci 0000:00:15.0: bridge window [io 0x4000-0x4fff]
May 8 21:46:01 server_mng kernel: pci 0000:00:15.0: bridge window [mem 0xfd400000-0xfd4fffff]
May 8 21:46:01 server_mng kernel: pci 0000:00:15.0: bridge window [mem 0xc0000000-0xc01fffff 64bit pref]
May 8 21:46:01 server_mng kernel: pci 0000:00:15.1: PCI bridge to [bus 04]
May 8 21:46:01 server_mng kernel: pci 0000:00:15.1: bridge window [io 0x8000-0x8fff]
May 8 21:46:01 server_mng kernel: pci 0000:00:15.1: bridge window [mem 0xfd000000-0xfd0fffff]
May 8 21:46:01 server_mng kernel: pci 0000:00:15.1: bridge window [mem 0xe7800000-0xe78fffff 64bit pref]
May 8 21:46:01 server_mng kernel: pci 0000:00:15.2: PCI bridge to [bus 05]
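These messages are the result of the system scanning the hardware configuration and assigning system resources to the various devices. Normally you would see them early in the boot sequence, basically right after the bootloader has loaded the kernel and started it.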
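If the system assigns resources incorrectly, that might cause it to crash immediately. In that case, the last message logged or displayed before the crash might help kernel developers identify which resource was assigned incorrectly, and the nature of the incorrect assignment (an overlapping allocation? an attempt to allocate a configuration that makes no sense? something else?).
If you have selected a more verbose boot process (in RHEL, typically by removing the boot options rhgb and quiet), all of these messages will be displayed as boot messages.
If the system has hot-pluggable PCI/PCI-X/PCIe/Thunderbolt devices, you might see a smaller set of similar messages at hot-plug time. However, the fact that there are both PnP ACPI resource allocations and PCI resource allocations, with messages for so many different PCI devices, supports the conclusion that these messages come from the boot process. A PCI hot-plug event would typically produce a group of messages covering a more limited set of PCI device ID numbers.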
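The boot-versus-hot-plug reasoning above can be sketched as a small Python heuristic: count the distinct PCI device addresses in a burst of messages and check whether PnP ACPI lines appear alongside them. The function name summarize_burst and the boot_threshold cut-off are hypothetical choices made for illustration, not anything the kernel defines; the sample lines are copied from the log in the question.

```python
import re

# Matches the PCI address in lines such as
# "... kernel: pci 0000:00:15.3: BAR 13: no space for [io size 0x1000]"
PCI_ADDR = re.compile(
    r"kernel: pci ([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]):"
)


def summarize_burst(lines, boot_threshold=5):
    """Summarize a burst of resource-assignment messages.

    boot_threshold is an arbitrary, hypothetical cut-off chosen for this
    sketch; tune it for the machine being examined.
    """
    devices = set()
    saw_pnp_acpi = False
    for line in lines:
        match = PCI_ADDR.search(line)
        if match:
            devices.add(match.group(1))
        if "PnP ACPI" in line:
            saw_pnp_acpi = True
    return {
        "pci_devices": len(devices),
        "pnp_acpi": saw_pnp_acpi,
        # Many distinct devices plus PnP ACPI lines suggests a boot-time scan.
        "looks_like_boot": saw_pnp_acpi and len(devices) >= boot_threshold,
    }


# A few lines taken from the log in the question.
sample = [
    "May 8 21:46:01 server_mng kernel: pnp: PnP ACPI: found 6 devices",
    "May 8 21:46:01 server_mng kernel: pci 0000:00:15.0: BAR 15: assigned [mem 0xc0000000-0xc01fffff 64bit pref]",
    "May 8 21:46:01 server_mng kernel: pci 0000:00:16.0: BAR 15: assigned [mem 0xc0200000-0xc03fffff 64bit pref]",
    "May 8 21:46:01 server_mng kernel: pci 0000:00:0f.0: BAR 6: assigned [mem 0xc0400000-0xc0407fff pref]",
    "May 8 21:46:01 server_mng kernel: pci 0000:00:15.3: BAR 13: no space for [io size 0x1000]",
    "May 8 21:46:01 server_mng kernel: pci 0000:00:15.3: BAR 13: failed to assign [io size 0x1000]",
    "May 8 21:46:01 server_mng kernel: pci 0000:00:01.0: PCI bridge to [bus 01]",
]

print(summarize_burst(sample))
```

This is only a rough indicator: a burst that mentions one or two PCI addresses and no PnP ACPI lines is more consistent with a hot-plug event, while a result like the one for the sample points toward a boot-time scan.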
This output looks like it is basically scanning all the (virtual) devices the (virtual) machine has, and that normally happens only at boot.
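When troubleshooting an unexpected system crash, the messages logged just before the reboot (if any) are usually the most useful for finding the cause of the crash.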
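If no unusual messages were logged before the reboot, it might mean that a problem was detected at the virtualization host level and the host killed and restarted the VM, the VM-level equivalent of kill -9. Or it might mean that the problem affected the storage driver, so the kernel was unable to write the error message to the log.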
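To look at what was logged just before each boot, one option is to find the start of every boot in the log and print the lines that precede it. The Python sketch below assumes the kernel banner containing "Linux version" is written to /var/log/messages near the start of each boot; that marker, the helper names, and the context size are assumptions for illustration, and the demo lines are made-up placeholders, not real log output.

```python
# Assumption: the kernel logs a banner containing this text early in each boot.
BOOT_MARKER = "kernel: Linux version"


def lines_before_boots(lines, context=20):
    """Return a list of (boot_line_index, preceding_lines), one per boot found."""
    results = []
    for index, line in enumerate(lines):
        if BOOT_MARKER in line:
            start = max(0, index - context)
            results.append((index, lines[start:index]))
    return results


def read_log(path="/var/log/messages"):
    """Read a log file into a list of lines, tolerating odd bytes."""
    with open(path, errors="replace") as handle:
        return handle.read().splitlines()


# Made-up placeholder lines, used only to demonstrate the function.
demo_lines = [
    "placeholder message A",
    "placeholder message B",
    "placeholder kernel: Linux version (first boot)",
    "placeholder message C",
    "placeholder kernel: Linux version (second boot)",
]

for index, preceding in lines_before_boots(demo_lines, context=2):
    print(f"boot starts at line {index + 1}; lines logged just before it:")
    for line in preceding:
        print("   ", line)
```

On the real machine, calling lines_before_boots(read_log()) as root would show the last entries written before each boot. If those entries look routine, that fits the scenarios described above, where the VM was killed from the host side or the kernel could not write its final messages to disk.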