Network disconnects after soft lockup
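I was running my system on Ubuntu Server 20.04 with an AMD Athlon 3200G and recently switched to a Ryzen 7 1800X. With the 3200G everything ran fine, without any problems.
However, since I switched to the new CPU, the network starts disconnecting after a few hours of normal operation, every time.
So I reinstalled Ubuntu Server 20.04 and set up all services again (mostly some Docker containers and a reverse proxy running inside Docker). This did not fix the problem; the same thing happened again.
Looking at journalctl, I noticed that around the time the system starts to misbehave and disconnect, there are several error messages mentioning a soft lockup and a CPU stuck for roughly 20 seconds (logs are attached at the end). These messages are logged about every 30 seconds.
I would not call myself a beginner at everyday Linux use, but my knowledge of the kernel is fairly limited, and unfortunately I cannot make sense of the error messages.
Maybe someone knows what is going on or can help me decipher these messages. I would be very glad for any help. Thanks in advance!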
Jul 19 17:21:16 ld-nas kernel: Modules linked in: veth xt_nat xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat aufs quota_v2 quota_tree nls_iso8859_1 dm_multipath scsi_dh_rd>
Jul 19 17:21:16 ld-nas kernel: glue_helper r8169 i2c_piix4 realtek ahci libahci wmi gpio_amdpt gpio_generic
Jul 19 17:21:16 ld-nas kernel: CPU: 7 PID: 25166 Comm: (imesyncd) Tainted: G L 5.4.0-77-generic #86-Ubuntu
Jul 19 17:21:16 ld-nas kernel: Hardware name: Gigabyte Technology Co., Ltd. B450M S2H/B450M S2H, BIOS F50 11/27/2019
Jul 19 17:21:16 ld-nas kernel: RIP: 0010:smp_call_function_many+0x208/0x270
Jul 19 17:21:16 ld-nas kernel: Code: 92 00 3b 05 de d2 70 01 89 c7 0f 83 9b fe ff ff 48 63 c7 48 8b 0b 48 03 0c c5 80 89 64 98 8b 41 18 a8 01 74 0a f3 90 8b 51 18 <83> e2 01 75 f6 eb c8 89 cf 48 c7 c2 20 b8 a>
Jul 19 17:21:16 ld-nas kernel: RSP: 0018:ffff9e2643fe7b60 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
Jul 19 17:21:16 ld-nas kernel: RAX: 0000000000000003 RBX: ffff892ffe9ebd40 RCX: ffff892ffe8323e0
Jul 19 17:21:16 ld-nas kernel: RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000000
Jul 19 17:21:16 ld-nas kernel: RBP: ffff9e2643fe7ba0 R08: ffff892ffcc38538 R09: ffff892ffcc38ec0
Jul 19 17:21:16 ld-nas kernel: R10: ffff892ffcc38538 R11: 0000000000000000 R12: ffffffff97281930
Jul 19 17:21:16 ld-nas kernel: R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000020
Jul 19 17:21:16 ld-nas kernel: FS: 00007fea61f31980(0000) GS:ffff892ffe9c0000(0000) knlGS:0000000000000000
Jul 19 17:21:16 ld-nas kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 19 17:21:16 ld-nas kernel: CR2: 000055bd11646d18 CR3: 00000003e8150000 CR4: 00000000003406e0
Jul 19 17:21:16 ld-nas kernel: Call Trace:
Jul 19 17:21:16 ld-nas kernel: ? load_new_mm_cr3+0xf0/0xf0
Jul 19 17:21:16 ld-nas kernel: on_each_cpu+0x2d/0x60
Jul 19 17:21:16 ld-nas kernel: flush_tlb_kernel_range+0x38/0x90
Jul 19 17:21:16 ld-nas kernel: __purge_vmap_area_lazy+0x70/0x6d0
Jul 19 17:21:16 ld-nas kernel: _vm_unmap_aliases+0xf5/0x130
Jul 19 17:21:16 ld-nas kernel: vm_unmap_aliases+0x19/0x20
Jul 19 17:21:16 ld-nas kernel: change_page_attr_set_clr+0xcf/0x200
Jul 19 17:21:16 ld-nas kernel: set_memory_ro+0x29/0x30
Jul 19 17:21:16 ld-nas kernel: bpf_int_jit_compile+0x2d1/0x340
Jul 19 17:21:16 ld-nas kernel: bpf_prog_select_runtime+0xa7/0x130
Jul 19 17:21:16 ld-nas kernel: bpf_prepare_filter+0x44c/0x4b0
Jul 19 17:21:16 ld-nas kernel: ? hardlockup_detector_perf_cleanup+0xa0/0xa0
Jul 19 17:21:16 ld-nas kernel: bpf_prog_create_from_user+0xc7/0x120
Jul 19 17:21:16 ld-nas kernel: seccomp_set_mode_filter+0x11c/0x740
Jul 19 17:21:16 ld-nas kernel: do_seccomp+0x39/0x200
Jul 19 17:21:16 ld-nas kernel: __x64_sys_seccomp+0x1a/0x20
Jul 19 17:21:16 ld-nas kernel: do_syscall_64+0x57/0x190
Jul 19 17:21:16 ld-nas kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jul 19 17:21:16 ld-nas kernel: RIP: 0033:0x7fea62dfe89d
Jul 19 17:21:16 ld-nas kernel: Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c3 f5 0>
Jul 19 17:21:16 ld-nas kernel: RSP: 002b:00007ffec565caa8 EFLAGS: 00000246 ORIG_RAX: 000000000000013d
Jul 19 17:21:16 ld-nas kernel: RAX: ffffffffffffffda RBX: 000055bd117d1f20 RCX: 00007fea62dfe89d
Jul 19 17:21:16 ld-nas kernel: RDX: 000055bd11794d60 RSI: 0000000000000000 RDI: 0000000000000001
Jul 19 17:21:16 ld-nas kernel: RBP: 000055bd11794d60 R08: 000055bd117d1f20 R09: 00007fea62c73350
Jul 19 17:21:16 ld-nas kernel: R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
Jul 19 17:21:16 ld-nas kernel: R13: 00007ffec565cad0 R14: 00007fea62c73dd0 R15: 00007ffec565cf50
Jul 19 17:21:43 ld-nas kernel: rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 7-... } 242419 jiffies s: 5425 root: 0x1/.
Jul 19 17:21:43 ld-nas kernel: rcu: blocking rcu_node structures: l=1:0-15:0x80/.
Jul 19 17:21:43 ld-nas kernel: Task dump for CPU 7:
Jul 19 17:21:43 ld-nas kernel: (imesyncd) R running task 0 25166 1 0x8000000c
Jul 19 17:21:43 ld-nas kernel: Call Trace:
Jul 19 17:21:43 ld-nas kernel: ? bpf_int_jit_compile+0x2d1/0x340
Jul 19 17:21:43 ld-nas kernel: ? bpf_prog_select_runtime+0xa7/0x130
Jul 19 17:21:43 ld-nas kernel: ? bpf_prepare_filter+0x44c/0x4b0
Jul 19 17:21:43 ld-nas kernel: ? hardlockup_detector_perf_cleanup+0xa0/0xa0
Jul 19 17:21:43 ld-nas kernel: ? bpf_prog_create_from_user+0xc7/0x120
Jul 19 17:21:43 ld-nas kernel: ? seccomp_set_mode_filter+0x11c/0x740
Jul 19 17:21:43 ld-nas kernel: ? do_seccomp+0x39/0x200
Jul 19 17:21:43 ld-nas kernel: ? __x64_sys_seccomp+0x1a/0x20
Jul 19 17:21:43 ld-nas kernel: ? do_syscall_64+0x57/0x190
Jul 19 17:21:43 ld-nas kernel: ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jul 19 17:21:44 ld-nas kernel: watchdog: BUG: soft lockup - CPU#7 stuck for 23s! [(imesyncd):25166]
-- Reboot --
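If you need more information (about the system, or more logs), I am happy to provide it.
Edit: After updating my BIOS to the latest version, the system seems to run more stably and goes longer without failing. However, there now appears to be a new problem that causes a hard lockup (on a different CPU).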
Jul 21 00:02:36 ld-nas kernel: xhci_hcd 0000:0a:00.3: xHCI host not responding to stop endpoint command. Jul 21 00:02:36 ld-nas kernel: xhci_hcd 0000:0a:00.3: Host halt failed, -110 Jul 21 00:02:36 ld-nas kernel: xhci_hcd 0000:0a:00.3: xHCI host controller not responding, assume dead Jul 21 00:02:36 ld-nas kernel: xhci_hcd 0000:0a:00.3: HC died; cleaning up Jul 21 00:02:36 ld-nas kernel: usb 3-2: USB disconnect, device number 2 Jul 21 00:02:36 ld-nas kernel: usb 4-3: USB disconnect, device number 2 Jul 21 00:02:53 ld-nas kernel: [UFW BLOCK] IN=enp8s0 OUT= MAC=01:00:5e:00:00:01:0c:8e:29:0e:d8:60:08:00 SRC=192.168.2.1 DST=224.0.0.1 LEN=36 TOS=0x00 PREC=0xC0 TTL=1 ID=16457 PROTO=2 Jul 21 00:03:36 ld-nas systemd-udevd[617]: sdd: Worker [49959] processing SEQNUM=12062 is taking a long time Jul 21 00:03:36 ld-nas systemd-udevd[617]: hiddev0: Worker [49962] processing SEQNUM=12069 is taking a long time Jul 21 00:03:36 ld-nas systemd-udevd[617]: 0003:046D:C52B.0001: Worker [49960] processing SEQNUM=12063 is taking a long time Jul 21 00:03:53 ld-nas kernel: [UFW BLOCK] IN=enp8s0 OUT= MAC=01:00:5e:00:00:01:0c:8e:29:0e:d8:60:08:00 SRC=192.168.2.1 DST=224.0.0.1 LEN=36 TOS=0x00 PREC=0xC0 TTL=1 ID=16459 PROTO=2 Jul 21 00:04:15 ld-nas systemd[1]: systemd-logind.service: Watchdog timeout (limit 3min)! Jul 21 00:04:15 ld-nas systemd[1]: systemd-logind.service: Killing process 1134 (systemd-logind) with signal SIGABRT. Jul 21 00:04:31 ld-nas systemd[1]: systemd-resolved.service: Watchdog timeout (limit 3min)! Jul 21 00:04:31 ld-nas systemd[1]: systemd-resolved.service: Killing process 1073 (systemd-resolve) with signal SIGABRT. 
Jul 21 00:04:53 ld-nas kernel: [UFW BLOCK] IN=enp8s0 OUT= MAC=01:00:5e:00:00:01:0c:8e:29:0e:d8:60:08:00 SRC=192.168.2.1 DST=224.0.0.1 LEN=36 TOS=0x00 PREC=0xC0 TTL=1 ID=16461 PROTO=2 Jul 21 00:05:36 ld-nas systemd-udevd[617]: sdd: Worker [49959] processing SEQNUM=12062 killed Jul 21 00:05:36 ld-nas systemd-udevd[617]: hiddev0: Worker [49962] processing SEQNUM=12069 killed Jul 21 00:05:36 ld-nas systemd-udevd[617]: 0003:046D:C52B.0001: Worker [49960] processing SEQNUM=12063 killed Jul 21 00:05:45 ld-nas systemd[1]: systemd-logind.service: State 'stop-watchdog' timed out. Terminating. Jul 21 00:05:49 ld-nas systemd[1]: snapd.service: Watchdog timeout (limit 5min)! Jul 21 00:05:49 ld-nas systemd[1]: snapd.service: Killing process 1123 (snapd) with signal SIGABRT. Jul 21 00:05:53 ld-nas kernel: [UFW BLOCK] IN=enp8s0 OUT= MAC=01:00:5e:00:00:01:0c:8e:29:0e:d8:60:08:00 SRC=192.168.2.1 DST=224.0.0.1 LEN=36 TOS=0x00 PREC=0xC0 TTL=1 ID=16463 PROTO=2 Jul 21 00:06:02 ld-nas systemd[1]: systemd-resolved.service: State 'stop-watchdog' timed out. Terminating. Jul 21 00:06:53 ld-nas kernel: [UFW BLOCK] IN=enp8s0 OUT= MAC=01:00:5e:00:00:01:0c:8e:29:0e:d8:60:08:00 SRC=192.168.2.1 DST=224.0.0.1 LEN=36 TOS=0x00 PREC=0xC0 TTL=1 ID=16465 PROTO=2 Jul 21 00:07:15 ld-nas systemd[1]: systemd-logind.service: State 'stop-sigterm' timed out. Killing. Jul 21 00:07:15 ld-nas systemd[1]: systemd-logind.service: Killing process 1134 (systemd-logind) with signal SIGKILL. Jul 21 00:07:19 ld-nas systemd[1]: snapd.service: State 'stop-watchdog' timed out. Terminating. Jul 21 00:07:32 ld-nas systemd[1]: systemd-resolved.service: State 'stop-sigterm' timed out. Killing. Jul 21 00:07:32 ld-nas systemd[1]: systemd-resolved.service: Killing process 1073 (systemd-resolve) with signal SIGKILL. 
Jul 21 00:07:53 ld-nas kernel: [UFW BLOCK] IN=enp8s0 OUT= MAC=01:00:5e:00:00:01:0c:8e:29:0e:d8:60:08:00 SRC=192.168.2.1 DST=224.0.0.1 LEN=36 TOS=0x00 PREC=0xC0 TTL=1 ID=16467 PROTO=2 Jul 21 00:08:46 ld-nas systemd[1]: systemd-logind.service: Processes still around after SIGKILL. Ignoring. Jul 21 00:08:50 ld-nas systemd[1]: snapd.service: State 'stop-sigterm' timed out. Killing. Jul 21 00:08:50 ld-nas systemd[1]: snapd.service: Killing process 1123 (snapd) with signal SIGKILL. Jul 21 00:08:53 ld-nas kernel: [UFW BLOCK] IN=enp8s0 OUT= MAC=01:00:5e:00:00:01:0c:8e:29:0e:d8:60:08:00 SRC=192.168.2.1 DST=224.0.0.1 LEN=36 TOS=0x00 PREC=0xC0 TTL=1 ID=16469 PROTO=2 Jul 21 00:09:02 ld-nas systemd[1]: systemd-resolved.service: Processes still around after SIGKILL. Ignoring. Jul 21 00:09:53 ld-nas kernel: [UFW BLOCK] IN=enp8s0 OUT= MAC=01:00:5e:00:00:01:0c:8e:29:0e:d8:60:08:00 SRC=192.168.2.1 DST=224.0.0.1 LEN=36 TOS=0x00 PREC=0xC0 TTL=1 ID=16471 PROTO=2 Jul 21 00:10:16 ld-nas systemd[1]: systemd-logind.service: State 'stop-final-sigterm' timed out. Killing. Jul 21 00:10:16 ld-nas systemd[1]: systemd-logind.service: Killing process 1134 (systemd-logind) with signal SIGKILL. Jul 21 00:10:20 ld-nas systemd[1]: snapd.service: Processes still around after SIGKILL. Ignoring. Jul 21 00:10:32 ld-nas systemd[1]: systemd-resolved.service: State 'stop-final-sigterm' timed out. Killing. Jul 21 00:10:32 ld-nas systemd[1]: systemd-resolved.service: Killing process 1073 (systemd-resolve) with signal SIGKILL. Jul 21 00:10:53 ld-nas kernel: [UFW BLOCK] IN=enp8s0 OUT= MAC=01:00:5e:00:00:01:0c:8e:29:0e:d8:60:08:00 SRC=192.168.2.1 DST=224.0.0.1 LEN=36 TOS=0x00 PREC=0xC0 TTL=1 ID=16474 PROTO=2 Jul 21 00:11:46 ld-nas systemd[1]: systemd-logind.service: Processes still around after final SIGKILL. Entering failed mode. Jul 21 00:11:46 ld-nas systemd[1]: systemd-logind.service: Failed with result 'watchdog'. 
Jul 21 00:11:46 ld-nas systemd[1]: systemd-logind.service: Scheduled restart job, restart counter is at 1. Jul 21 00:11:46 ld-nas systemd[1]: Stopped Login Service. Jul 21 00:11:46 ld-nas systemd[1]: Condition check resulted in Load Kernel Module drm being skipped. Jul 21 00:11:46 ld-nas systemd[1]: systemd-logind.service: Found left-over process 1134 (systemd-logind) in control group while starting unit. Ignoring. Jul 21 00:11:46 ld-nas systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies. Jul 21 00:11:46 ld-nas systemd[1]: Starting Login Service... Jul 21 00:11:50 ld-nas systemd[1]: snapd.service: State 'stop-final-sigterm' timed out. Killing. Jul 21 00:11:50 ld-nas systemd[1]: snapd.service: Killing process 1123 (snapd) with signal SIGKILL. Jul 21 00:11:53 ld-nas kernel: [UFW BLOCK] IN=enp8s0 OUT= MAC=01:00:5e:00:00:01:0c:8e:29:0e:d8:60:08:00 SRC=192.168.2.1 DST=224.0.0.1 LEN=36 TOS=0x00 PREC=0xC0 TTL=1 ID=16475 PROTO=2 Jul 21 00:12:03 ld-nas systemd[1]: systemd-resolved.service: Processes still around after final SIGKILL. Entering failed mode. Jul 21 00:12:03 ld-nas systemd[1]: systemd-resolved.service: Failed with result 'watchdog'. Jul 21 00:12:03 ld-nas systemd[1]: systemd-resolved.service: Scheduled restart job, restart counter is at 1. Jul 21 00:12:03 ld-nas systemd[1]: Stopped Network Name Resolution. Jul 21 00:12:03 ld-nas systemd[1]: systemd-resolved.service: Found left-over process 1073 (systemd-resolve) in control group while starting unit. Ignoring. Jul 21 00:12:03 ld-nas systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies. Jul 21 00:12:03 ld-nas systemd[1]: Starting Network Name Resolution... Jul 21 00:13:17 ld-nas systemd[1]: systemd-logind.service: start operation timed out. Terminating. Jul 21 00:13:20 ld-nas systemd[1]: snapd.service: Processes still around after final SIGKILL. Entering failed mode. 
Jul 21 00:13:20 ld-nas systemd[1]: snapd.service: Failed with result 'watchdog'. Jul 21 00:13:21 ld-nas systemd[1]: snapd.service: Scheduled restart job, restart counter is at 1. Jul 21 00:13:21 ld-nas systemd[1]: Stopped Snap Daemon. Jul 21 00:13:21 ld-nas systemd[1]: snapd.service: Found left-over process 1123 (snapd) in control group while starting unit. Ignoring. Jul 21 00:13:21 ld-nas systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies. Jul 21 00:13:21 ld-nas systemd[1]: Starting Snap Daemon... Jul 21 00:13:21 ld-nas systemd-udevd[617]: Worker [49959] terminated by signal 9 (KILL) Jul 21 00:13:21 ld-nas systemd-udevd[617]: sdd: Worker [49959] failed Jul 21 00:13:33 ld-nas systemd[1]: systemd-resolved.service: start operation timed out. Terminating. Jul 21 00:14:47 ld-nas systemd[1]: systemd-logind.service: State 'stop-sigterm' timed out. Killing. Jul 21 00:14:47 ld-nas systemd[1]: systemd-logind.service: Killing process 1134 (systemd-logind) with signal SIGKILL. Jul 21 00:14:51 ld-nas systemd[1]: snapd.service: start operation timed out. Terminating. Jul 21 00:14:51 ld-nas systemd[1]: snapd.service: Failed with result 'timeout'. Jul 21 00:14:51 ld-nas systemd[1]: Failed to start Snap Daemon. Jul 21 00:14:51 ld-nas systemd[1]: snapd.service: Scheduled restart job, restart counter is at 2. Jul 21 00:14:51 ld-nas systemd[1]: Stopped Snap Daemon. Jul 21 00:14:51 ld-nas systemd[1]: snapd.service: Found left-over process 1123 (snapd) in control group while starting unit. Ignoring. Jul 21 00:14:51 ld-nas systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies. Jul 21 00:14:51 ld-nas systemd[1]: snapd.service: Found left-over process 50071 (snapd) in control group while starting unit. Ignoring. Jul 21 00:14:51 ld-nas systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies. 
Jul 21 00:14:51 ld-nas systemd[1]: Starting Snap Daemon... Jul 21 00:15:03 ld-nas systemd[1]: systemd-resolved.service: State 'stop-sigterm' timed out. Killing. Jul 21 00:15:03 ld-nas systemd[1]: systemd-resolved.service: Killing process 1073 (systemd-resolve) with signal SIGKILL. Jul 21 00:16:17 ld-nas systemd[1]: systemd-logind.service: Processes still around after SIGKILL. Ignoring. Jul 21 00:16:21 ld-nas systemd[1]: snapd.service: start operation timed out. Terminating. Jul 21 00:16:21 ld-nas systemd[1]: snapd.service: Failed with result 'timeout'. Jul 21 00:16:21 ld-nas systemd[1]: Failed to start Snap Daemon. Jul 21 00:16:22 ld-nas systemd[1]: snapd.service: Scheduled restart job, restart counter is at 3. Jul 21 00:16:22 ld-nas systemd[1]: Stopped Snap Daemon. Jul 21 00:16:22 ld-nas systemd[1]: snapd.service: Found left-over process 1123 (snapd) in control group while starting unit. Ignoring. Jul 21 00:16:22 ld-nas systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies. Jul 21 00:16:22 ld-nas systemd[1]: snapd.service: Found left-over process 50079 (snapd) in control group while starting unit. Ignoring. Jul 21 00:16:22 ld-nas systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies. Jul 21 00:16:22 ld-nas systemd[1]: Starting Snap Daemon... Jul 21 00:16:34 ld-nas systemd[1]: systemd-resolved.service: Processes still around after SIGKILL. Ignoring. Jul 21 00:17:47 ld-nas systemd[1]: systemd-logind.service: State 'stop-final-sigterm' timed out. Killing. Jul 21 00:17:47 ld-nas systemd[1]: systemd-logind.service: Killing process 1134 (systemd-logind) with signal SIGKILL. Jul 21 00:17:52 ld-nas systemd[1]: snapd.service: start operation timed out. Terminating. Jul 21 00:17:52 ld-nas systemd[1]: snapd.service: Failed with result 'timeout'. Jul 21 00:17:52 ld-nas systemd[1]: Failed to start Snap Daemon. 
Jul 21 00:17:52 ld-nas systemd[1]: snapd.service: Scheduled restart job, restart counter is at 4. Jul 21 00:17:52 ld-nas systemd[1]: Stopped Snap Daemon. Jul 21 00:17:52 ld-nas systemd[1]: snapd.service: Found left-over process 1123 (snapd) in control group while starting unit. Ignoring. Jul 21 00:17:52 ld-nas systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies. Jul 21 00:17:52 ld-nas systemd[1]: snapd.service: Found left-over process 50087 (snapd) in control group while starting unit. Ignoring. Jul 21 00:17:52 ld-nas systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies. Jul 21 00:17:52 ld-nas systemd[1]: Starting Snap Daemon... Jul 21 00:18:04 ld-nas systemd[1]: systemd-resolved.service: State 'stop-final-sigterm' timed out. Killing. Jul 21 00:18:04 ld-nas systemd[1]: systemd-resolved.service: Killing process 1073 (systemd-resolve) with signal SIGKILL. Jul 21 00:19:18 ld-nas systemd[1]: systemd-logind.service: Processes still around after final SIGKILL. Entering failed mode. Jul 21 00:19:18 ld-nas systemd[1]: systemd-logind.service: Failed with result 'timeout'. Jul 21 00:19:18 ld-nas systemd[1]: Failed to start Login Service. Jul 21 00:19:18 ld-nas systemd[1]: systemd-logind.service: Scheduled restart job, restart counter is at 2. Jul 21 00:19:18 ld-nas systemd[1]: Stopped Login Service. Jul 21 00:19:18 ld-nas systemd[1]: Condition check resulted in Load Kernel Module drm being skipped. Jul 21 00:19:18 ld-nas systemd[1]: systemd-logind.service: Found left-over process 1134 (systemd-logind) in control group while starting unit. Ignoring. Jul 21 00:19:18 ld-nas systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies. Jul 21 00:19:18 ld-nas systemd[1]: Starting Login Service... Jul 21 00:19:22 ld-nas systemd[1]: snapd.service: start operation timed out. Terminating. 
Jul 21 00:19:22 ld-nas systemd[1]: snapd.service: Failed with result 'timeout'. Jul 21 00:19:22 ld-nas systemd[1]: Failed to start Snap Daemon. Jul 21 00:19:23 ld-nas systemd[1]: snapd.service: Scheduled restart job, restart counter is at 5. Jul 21 00:19:23 ld-nas systemd[1]: Stopped Snap Daemon. Jul 21 00:19:23 ld-nas systemd[1]: snapd.service: Found left-over process 1123 (snapd) in control group while starting unit. Ignoring. Jul 21 00:19:23 ld-nas systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies. Jul 21 00:19:23 ld-nas systemd[1]: snapd.service: Found left-over process 50095 (snapd) in control group while starting unit. Ignoring. Jul 21 00:19:23 ld-nas systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies. Jul 21 00:19:23 ld-nas systemd[1]: Starting Snap Daemon... Jul 21 00:19:23 ld-nas snapd[50100]: AppArmor status: apparmor is enabled and all features are available Jul 21 00:19:34 ld-nas systemd[1]: systemd-resolved.service: Processes still around after final SIGKILL. Entering failed mode. Jul 21 00:19:34 ld-nas systemd[1]: systemd-resolved.service: Failed with result 'timeout'. Jul 21 00:19:34 ld-nas systemd[1]: Failed to start Network Name Resolution. Jul 21 00:19:34 ld-nas systemd[1]: systemd-resolved.service: Scheduled restart job, restart counter is at 2. Jul 21 00:19:34 ld-nas systemd[1]: Stopped Network Name Resolution. Jul 21 00:19:34 ld-nas systemd[1]: systemd-resolved.service: Found left-over process 1073 (systemd-resolve) in control group while starting unit. Ignoring. Jul 21 00:19:34 ld-nas systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies. Jul 21 00:19:34 ld-nas systemd[1]: Starting Network Name Resolution... Jul 21 00:20:48 ld-nas systemd[1]: systemd-logind.service: start operation timed out. Terminating. 
Jul 21 00:20:53 ld-nas systemd[1]: snapd.service: start operation timed out. Terminating. Jul 21 00:20:53 ld-nas systemd[1]: snapd.service: Failed with result 'timeout'. Jul 21 00:20:53 ld-nas systemd[1]: Failed to start Snap Daemon. Jul 21 00:20:53 ld-nas systemd[1]: snapd.service: Scheduled restart job, restart counter is at 6. Jul 21 00:20:53 ld-nas systemd[1]: Stopped Snap Daemon. Jul 21 00:20:53 ld-nas systemd[1]: snapd.service: Found left-over process 1123 (snapd) in control group while starting unit. Ignoring. Jul 21 00:20:53 ld-nas systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies. Jul 21 00:20:53 ld-nas systemd[1]: snapd.service: Found left-over process 50109 (snapd) in control group while starting unit. Ignoring. Jul 21 00:20:53 ld-nas systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies. Jul 21 00:20:53 ld-nas systemd[1]: Starting Snap Daemon... Jul 21 00:20:53 ld-nas snapd[50115]: AppArmor status: apparmor is enabled and all features are available Jul 21 00:04:29 ld-nas kernel: NMI watchdog: Watchdog detected hard LOCKUP on cpu 12 Jul 21 00:04:29 ld-nas kernel: Modules linked in: veth xt_nat xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat aufs quota_v2 quota_tree nls_iso8859_1 dm_multipath scsi_dh_rd> Jul 21 00:04:29 ld-nas kernel: hid_generic usbhid hid nouveau crct10dif_pclmul mxm_wmi crc32_pclmul video ghash_clmulni_intel i2c_algo_bit ttm drm_kms_helper aesni_intel syscopyarea sysfillrect crypto_simd s> Jul 21 00:04:29 ld-nas kernel: CPU: 12 PID: 50109 Comm: systemd-detect- Not tainted 5.4.0-77-generic #86-Ubuntu Jul 21 00:04:29 ld-nas kernel: Hardware name: Gigabyte Technology Co., Ltd. 
B450M S2H/B450M S2H, BIOS F61c 05/10/2021 Jul 21 00:04:29 ld-nas kernel: RIP: 0010:smp_call_function_single+0x9b/0x110 Jul 21 00:04:29 ld-nas kernel: Code: 65 8b 05 90 81 6d 64 a9 00 01 1f 00 75 79 85 c9 75 40 48 c7 c6 c0 bc 02 00 65 48 03 35 46 19 6d 64 8b 46 18 a8 01 74 09 f3 90 <8b> 46 18 a8 01 75 f7 83 4e 18 01 4c 89 c9 4> Jul 21 00:04:29 ld-nas kernel: RSP: 0018:ffffb4c60448fba0 EFLAGS: 00000202 Jul 21 00:04:29 ld-nas kernel: RAX: 0000000000000001 RBX: 0000010da42e2b19 RCX: 0000000000000000 Jul 21 00:04:29 ld-nas kernel: RDX: 0000000000000000 RSI: ffff8d02eeb2bcc0 RDI: 0000000000000001 Jul 21 00:04:29 ld-nas kernel: RBP: ffffb4c60448fbe8 R08: ffffffff9b846090 R09: 0000000000000000 Jul 21 00:04:29 ld-nas kernel: R10: 0000000000000001 R11: 006f666e69757063 R12: 0000000000000001 Jul 21 00:04:29 ld-nas kernel: R13: 00002b8a84bbd593 R14: 0000000000000001 R15: ffff8d02e3fd5f00 Jul 21 00:04:29 ld-nas kernel: FS: 00007fde6214c980(0000) GS:ffff8d02eeb00000(0000) knlGS:0000000000000000 Jul 21 00:04:29 ld-nas kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Jul 21 00:04:29 ld-nas kernel: CR2: 0000562bbe2c2d98 CR3: 00000002577a6000 CR4: 00000000003406e0 Jul 21 00:04:29 ld-nas kernel: Call Trace: Jul 21 00:04:29 ld-nas kernel: ? ktime_get+0x3e/0xa0 Jul 21 00:04:29 ld-nas kernel: aperfmperf_snapshot_cpu+0x42/0x50 Jul 21 00:04:29 ld-nas kernel: arch_freq_prepare_all+0x67/0xa0 Jul 21 00:04:29 ld-nas kernel: cpuinfo_open+0x13/0x30 Jul 21 00:04:29 ld-nas kernel: proc_reg_open+0x77/0x130 Jul 21 00:04:29 ld-nas kernel: ? proc_put_link+0x10/0x10 Jul 21 00:04:29 ld-nas kernel: do_dentry_open+0x143/0x3a0 Jul 21 00:04:29 ld-nas kernel: vfs_open+0x2d/0x30 Jul 21 00:04:29 ld-nas kernel: do_last+0x194/0x900 Jul 21 00:04:29 ld-nas kernel: path_openat+0x8d/0x290 Jul 21 00:04:29 ld-nas kernel: ? putname+0x4a/0x50 Jul 21 00:04:29 ld-nas kernel: do_filp_open+0x91/0x100 Jul 21 00:04:29 ld-nas kernel: ? 
__alloc_fd+0x46/0x150 Jul 21 00:04:29 ld-nas kernel: do_sys_open+0x17e/0x290 Jul 21 00:04:29 ld-nas kernel: __x64_sys_openat+0x20/0x30 Jul 21 00:04:29 ld-nas kernel: do_syscall_64+0x57/0x190 Jul 21 00:04:29 ld-nas kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9 Jul 21 00:04:29 ld-nas kernel: RIP: 0033:0x7fde62ff9eab Jul 21 00:04:29 ld-nas kernel: Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 91 00 00 00 48 8b 4> Jul 21 00:04:29 ld-nas kernel: RSP: 002b:00007ffeaa449770 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 Jul 21 00:04:29 ld-nas kernel: RAX: ffffffffffffffda RBX: 0000562bbe2c12d0 RCX: 00007fde62ff9eab Jul 21 00:04:29 ld-nas kernel: RDX: 0000000000080000 RSI: 00007fde62e6b227 RDI: 00000000ffffff9c Jul 21 00:04:29 ld-nas kernel: RBP: 00007fde62e6b227 R08: 0000000000000008 R09: 0000000000000001 Jul 21 00:04:29 ld-nas kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000080000 Jul 21 00:04:29 ld-nas kernel: R13: 00007fde62e92e21 R14: 00007fde62e6b869 R15: 00007fde62e6b88c
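There are known issues with Ryzen and Linux. Are you SSHing into this machine? If you search Google for “ryzen linux soft lockup”, you will find hundreds of threads about systems freezing and needing a reboot, but none of them mention intermittent network connectivity as a symptom.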
This thread explains that adding
processor.max_cstate=5 rcu_nocbs=0-15
to your kernel's boot options may resolve the issue.
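On Ubuntu, boot options like these are usually appended to `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub`, followed by `sudo update-grub` and a reboot. After rebooting, it is worth confirming that the running kernel actually received them. The following is a minimal Python sketch of such a check. The function and constant names are hypothetical, and it only compares whitespace-separated tokens, so it assumes the options are spelled exactly as above.

```python
from typing import Iterable, List

# Options quoted in the answer above. Whether they help on a particular
# board is not guaranteed; this list is only what we look for.
SUGGESTED_PARAMS = ["processor.max_cstate=5", "rcu_nocbs=0-15"]


def missing_boot_params(cmdline: str,
                        wanted: Iterable[str] = SUGGESTED_PARAMS) -> List[str]:
    """Return the wanted options that do not appear in a kernel command line."""
    present = set(cmdline.split())
    return [param for param in wanted if param not in present]


def read_running_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with (Linux only)."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()
```

Calling `missing_boot_params(read_running_cmdline())` on the affected machine should return an empty list once both options are active. Anything it returns is an option that did not make it onto the command line.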
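This bug report looks identical to your problem: same CPU and same kernel. In any case, updating your BIOS is a good idea (yours is from 2019). Some people claim this is a power issue that occurs when the CPU is idle, and suggest experimenting with the power settings in the BIOS to resolve the lockups.
If all else fails, the last thing to try is contacting AMD to see whether they will give you an RMA.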
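To judge whether a BIOS update or a power setting change actually helped, it can be useful to count lockup reports per CPU in saved kernel logs before and after the change. Below is a rough Python sketch. The helper name is hypothetical, and the two patterns are taken only from the watchdog messages shown in the question, so other kernels may word them differently.

```python
import re
from collections import Counter
from typing import Iterable

# Patterns based on the two watchdog messages in the question's logs.
SOFT_LOCKUP = re.compile(r"soft lockup - CPU#(\d+)")
HARD_LOCKUP = re.compile(r"hard LOCKUP on cpu (\d+)")


def count_lockups(lines: Iterable[str]) -> Counter:
    """Count lockup reports, keyed by (kind, cpu number)."""
    counts: Counter = Counter()
    for line in lines:
        # finditer handles pasted logs where several entries share one line.
        for match in SOFT_LOCKUP.finditer(line):
            counts[("soft", int(match.group(1)))] += 1
        for match in HARD_LOCKUP.finditer(line):
            counts[("hard", int(match.group(1)))] += 1
    return counts
```

For example, the kernel messages of the previous boot can be saved with `journalctl -k -b -1 > previous-boot.log` and passed in as `count_lockups(open("previous-boot.log"))`. Comparing the counts across boots gives a crude before/after picture.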