Linux
NUC with CentOS 7 crashing: Microcode SW error detected
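I am trying to figure out why the WiFi on my Intel NUC running CentOS 7 keeps crashing. For context, I have a 5-node Hadoop cluster whose nodes are all configured identically (as far as I know), yet the other machines on WiFi do not crash. I don't know what is wrong with this one machine.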
Here is the error from /var/log/messages. It is the same message I have seen repeatedly over the several days I have been watching this problem.

Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: Microcode SW error detected. Restarting 0x2000000.
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: Start IWL Error Log Dump:
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: Status: 0x00000100, count: 6
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: Loaded firmware version: 34.0.1
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x000022CE | ADVANCED_SYSASSERT
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x05900280 | trm_hw_status0
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | trm_hw_status1
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00023FDC | branchlink2
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x0003915A | interruptlink1
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | interruptlink2
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x0000012C | data1
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x03830000 | data2
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0xDEADBEEF | data3
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0xD28011F1 | beacon time
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x72F4FDDD | tsf low
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000182 | tsf hi
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | time gp1
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0xCA511FA7 | time gp2
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000001 | uCode revision type
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000022 | uCode version major
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | uCode version minor
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000230 | hw version
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00C89000 | board version
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x0A96001C | hcmd
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0xA7F93882 | isr0
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00050000 | isr1
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x0020180A | isr2
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x40417DCD | isr3
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | isr4
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x0A95001C | last cmd Id
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | wait_event
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00004288 | l2p_control
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00018024 | l2p_duration
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | l2p_mhvalid
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x000000EF | l2p_addr_match
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x0000000D | lmpm_pmg_sel
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x30101345 | timestamp
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00007888 | flow_handler
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: Start IWL Error Log Dump:
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: Status: 0x00000100, count: 7
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000070 | ADVANCED_SYSASSERT
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | umac branchlink1
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0xC0086964 | umac branchlink2
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0xC0083A94 | umac interruptlink1
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0xC0083A94 | umac interruptlink2
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000800 | umac data1
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0xC0083A94 | umac data2
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0xDEADBEEF | umac data3
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000022 | umac major
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | umac minor
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0xC088628C | frame pointer
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0xC088628C | stack pointer
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00DF019C | last host cmd
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | isr status reg
Jan 2 08:41:06 mapr04 kernel: ieee80211 phy0: Hardware restart was requested
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: FW error in SYNC CMD STATISTICS_CMD
Jan 2 08:41:06 mapr04 kernel: CPU: 0 PID: 4898 Comm: NetworkManager Kdump: loaded Not tainted 3.10.0-957.1.3.el7.x86_64 #1
Jan 2 08:41:06 mapr04 kernel: Hardware name: Intel Corporation NUC7i7BNH/NUC7i7BNB, BIOS BNKBL357.86A.0049.2017.0724.1541 07/24/2017
Jan 2 08:41:06 mapr04 kernel: Call Trace:
Jan 2 08:41:06 mapr04 kernel: [<ffffffffaeb61e41>] dump_stack+0x19/0x1b
Jan 2 08:41:06 mapr04 kernel: [<ffffffffc0afa983>] iwl_trans_pcie_send_hcmd+0x563/0x580 [iwlwifi]
Jan 2 08:41:06 mapr04 kernel: [<ffffffffae4c2d00>] ? wake_up_atomic_t+0x30/0x30
Jan 2 08:41:06 mapr04 kernel: [<ffffffffc0b060fc>] iwl_trans_send_cmd+0x5c/0xe0 [iwlwifi]
Jan 2 08:41:06 mapr04 kernel: [<ffffffffc0c6d312>] iwl_mvm_send_cmd+0x32/0xb0 [iwlmvm]
Jan 2 08:41:06 mapr04 kernel: [<ffffffffc0c6e632>] iwl_mvm_request_statistics+0x72/0x100 [iwlmvm]
Jan 2 08:41:06 mapr04 kernel: [<ffffffffc0c616fe>] iwl_mvm_mac_sta_statistics+0xbe/0x100 [iwlmvm]
Jan 2 08:41:06 mapr04 kernel: [<ffffffffc0bb68f7>] sta_set_sinfo+0xb7/0x800 [mac80211]
Jan 2 08:41:06 mapr04 kernel: [<ffffffffc0bcd052>] ieee80211_get_station+0x52/0x80 [mac80211]
Jan 2 08:41:06 mapr04 kernel: [<ffffffffc08cae41>] nl80211_get_station+0xa1/0x240 [cfg80211]
Jan 2 08:41:06 mapr04 kernel: [<ffffffffae794d0d>] ? list_del+0xd/0x30
Jan 2 08:41:06 mapr04 kernel: [<ffffffffae5bdf1a>] ? __rmqueue+0x8a/0x460
Jan 2 08:41:06 mapr04 kernel: [<ffffffffaea77918>] genl_family_rcv_msg+0x208/0x430
Jan 2 08:41:06 mapr04 kernel: [<ffffffffae5bf134>] ? free_one_page+0x2e4/0x310
Jan 2 08:41:06 mapr04 kernel: [<ffffffffaea77b9b>] genl_rcv_msg+0x5b/0xc0
Jan 2 08:41:06 mapr04 kernel: [<ffffffffaea73ec0>] ? __netlink_lookup+0xc0/0x110
Jan 2 08:41:06 mapr04 kernel: [<ffffffffaea77b40>] ? genl_family_rcv_msg+0x430/0x430
Jan 2 08:41:06 mapr04 kernel: [<ffffffffaea75bab>] netlink_rcv_skb+0xab/0xc0
Jan 2 08:41:06 mapr04 kernel: [<ffffffffaea760e8>] genl_rcv+0x28/0x40
Jan 2 08:41:06 mapr04 kernel: [<ffffffffaea75530>] netlink_unicast+0x170/0x210
Jan 2 08:41:06 mapr04 kernel: [<ffffffffae78c042>] ? memcpy_fromiovec+0x62/0xb0
Jan 2 08:41:06 mapr04 kernel: [<ffffffffaea758d8>] netlink_sendmsg+0x308/0x420
Jan 2 08:41:06 mapr04 kernel: [<ffffffffaea73112>] ? netlink_recvmsg+0x212/0x490
Jan 2 08:41:06 mapr04 kernel: [<ffffffffaea193a6>] sock_sendmsg+0xb6/0xf0
Jan 2 08:41:06 mapr04 kernel: [<ffffffffaea194f5>] ? sock_recvmsg+0xc5/0x100
Jan 2 08:41:06 mapr04 kernel: [<ffffffffaea1a269>] ___sys_sendmsg+0x3e9/0x400
Jan 2 08:41:06 mapr04 kernel: [<ffffffffae656fe0>] ? __pollwait+0xf0/0xf0
Jan 2 08:41:06 mapr04 kernel: [<ffffffffae68ee1e>] ? ep_poll+0x31e/0x360
Jan 2 08:41:06 mapr04 kernel: [<ffffffffaea1b921>] __sys_sendmsg+0x51/0x90
Jan 2 08:41:06 mapr04 kernel: [<ffffffffaea1b972>] SyS_sendmsg+0x12/0x20
Jan 2 08:41:06 mapr04 kernel: [<ffffffffaeb74ddb>] system_call_fastpath+0x22/0x27
Where should I start debugging? I can edit the original post with updates.
Here is some information that I think may be helpful:
uname -a:
Linux mapr04.wired.carnoustie 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 29 14:49:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

dmesg | grep iwlwifi:
[ 3.822041] iwlwifi 0000:3a:00.0: irq 132 for MSI/MSI-X
[ 3.831295] iwlwifi 0000:3a:00.0: loaded firmware version 34.0.1 op_mode iwlmvm
[ 3.924043] iwlwifi 0000:3a:00.0: Detected Intel(R) Dual Band Wireless AC 8265, REV=0x230
[ 3.984049] iwlwifi 0000:3a:00.0: base HW address: f8:94:c2:5c:07:24
Here is the latest error:

Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: Error sending STATISTICS_CMD: time out after 2000ms.
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: Current CMD queue read_ptr 246 write_ptr 247
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: Start IWL Error Log Dump:
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: Status: 0x00000100, count: 6
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: Loaded firmware version: 34.0.1
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000084 | NMI_INTERRUPT_UNKNOWN
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000280 | trm_hw_status0
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | trm_hw_status1
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00023FDC | branchlink2
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x0003915A | interruptlink1
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x0003915A | interruptlink2
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | data1
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000080 | data2
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x03830000 | data3
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0xD4C029D9 | beacon time
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x456535F1 | tsf low
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x000001BA | tsf hi
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | time gp1
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x6D3BFE27 | time gp2
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000001 | uCode revision type
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000022 | uCode version major
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | uCode version minor
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000230 | hw version
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00C89000 | board version
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x0000001C | hcmd
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00012000 | isr0
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | isr1
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x0000180A | isr2
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00417CC0 | isr3
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | isr4
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x0A89001C | last cmd Id
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | wait_event
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00004288 | l2p_control
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00018024 | l2p_duration
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | l2p_mhvalid
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x000000EF | l2p_addr_match
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x0000000D | lmpm_pmg_sel
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x30101345 | timestamp
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00002838 | flow_handler
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: Start IWL Error Log Dump:
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: Status: 0x00000100, count: 7
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000070 | ADVANCED_SYSASSERT
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | umac branchlink1
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0xC0086964 | umac branchlink2
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0xC0083A94 | umac interruptlink1
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0xC0083A94 | umac interruptlink2
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000800 | umac data1
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0xC0083A94 | umac data2
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0xDEADBEEF | umac data3
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000022 | umac major
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | umac minor
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0xC088628C | frame pointer
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0xC088628C | stack pointer
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00F6019C | last host cmd
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000000 | isr status reg
Jan 5 03:17:01 mapr04 kernel: ieee80211 phy0: Hardware restart was requested
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: Microcode SW error detected. Restarting 0x2000000.
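One possible first debugging step is to tally which firmware assert heads each dump, since the two dumps above differ on that row (ADVANCED_SYSASSERT on Jan 2, NMI_INTERRUPT_UNKNOWN on Jan 5). The following is only a minimal sketch: the function name `summarize_asserts` and the regular expression are illustrative assumptions, not part of any iwlwifi tooling, and the sample lines are copied from the logs above.

```python
import re
from collections import Counter

# Assumption: assert names are upper-case identifiers (as in the dumps above),
# while register rows such as "trm_hw_status0" or "umac data1" start lower-case.
ASSERT_RE = re.compile(r"iwlwifi \S+: (0x[0-9A-Fa-f]{8}) \| ([A-Z][A-Z0-9_]+)\b")

def summarize_asserts(log_text):
    """Count firmware assert names found in iwlwifi error dumps."""
    return Counter(name for _code, name in ASSERT_RE.findall(log_text))

# A few lines copied from the two dumps in this post.
sample = """\
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x000022CE | ADVANCED_SYSASSERT
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x05900280 | trm_hw_status0
Jan 2 08:41:06 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000070 | ADVANCED_SYSASSERT
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000084 | NMI_INTERRUPT_UNKNOWN
Jan 5 03:17:01 mapr04 kernel: iwlwifi 0000:3a:00.0: 0x00000070 | ADVANCED_SYSASSERT
"""

if __name__ == "__main__":
    for name, count in summarize_asserts(sample).most_common():
        print(name, count)
```

To run it against the real log, the contents of /var/log/messages (usually readable only by root) could be passed to `summarize_asserts` instead of `sample`.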
It seems the WiFi driver cannot manage the WiFi hardware in the NUC.
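- Several Linux distributions can be tried live, without installing them. I believe the NUC has Intel WiFi, which should work with the built-in Linux drivers, but the drivers must be recent enough.
- I have a NUC with Intel 6th-generation hardware. I noticed that older versions of operating systems cannot manage the WiFi hardware, whereas newer versions manage it "out of the box", without any tweaks.
- Edit 1: I tested with live systems. Ubuntu 18.04.1 LTS can manage both the wired and the wireless hardware of my NUC6i3SYH. Debian 9 (Stretch) manages the wired network automatically. WiFi failed for me there, but someone else might get it working (I don't know whether it is a driver problem or whether I simply could not work the WiFi user interface in Debian).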
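- Edit 2: I downloaded CentOS-7-x86_64-LiveGNOME-1810.iso and ran it live, and it can manage both the wired and the wireless hardware of my NUC6i3SYH. It started up as easily as Ubuntu 18.04.1 LTS. However, I did not test its stability over a long period.
- Edit 3: You should consider that the hardware may be damaged (for example, failing when it gets hot). However, if it runs well with another operating system, you can conclude that the hardware is good.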
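- When was your NUC hardware developed, and when was the CentOS 7 software developed?
- CentOS 7 has an old kernel series, 3.10; the kernel version in the live system "1810" is 3.10.0-957.el7.x86_64 #1 SMP. Ubuntu 18.04.1 live has kernel version 4.15.0-29, and an up-to-date installed system has 4.15.0-43.
- Please try another operating system with a newer Linux kernel and newer hardware drivers.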
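The kernel releases mentioned in this thread can be compared with a small sketch like the one below. The helper name `kernel_base_version` is a hypothetical illustration, and the comparison only looks at the upstream base version. Because RHEL-family kernels such as CentOS's carry backported drivers, the base version is only a rough indicator of driver age.

```python
import re

def kernel_base_version(release):
    """Return (major, minor, patch) from a `uname -r` style release string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())

# Release strings quoted in the question and in this answer.
releases = {
    "CentOS 7 (installed, from the question)": "3.10.0-957.1.3.el7.x86_64",
    "CentOS 7 live 1810": "3.10.0-957.el7.x86_64",
    "Ubuntu 18.04.1 live": "4.15.0-29",
    "Ubuntu 18.04.1 installed, updated": "4.15.0-43",
}

if __name__ == "__main__":
    # List the systems from oldest to newest base kernel version.
    ordered = sorted(releases.items(), key=lambda kv: kernel_base_version(kv[1]))
    for name, release in ordered:
        print(kernel_base_version(release), name)
```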