NTP server is reachable but is never selected, and the time is never set
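We have some embedded devices that use ntpd (4.2.8p10) to synchronize time. One of our customers uses their own NTP server on an internal network. Running ntpd -dgq in debug mode, we can see that the server is reachable and that we receive offset, delay, and jitter information. However, ntpd simply exits with "ntpd: no servers found" and never selects the server or sets the local time.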
    2 Nov 11:57:05 ntpd[20218]: ntpd 4.2.8p10@1.3728-o Thu Jul 26 19:52:20 UTC 2018 (2): Starting
    2 Nov 11:57:05 ntpd[20218]: Command line: ntpd -dgq
    2 Nov 11:57:05 ntpd[20218]: proto: precision = 2.000 usec (-19)
    Finished Parsing!!
    restrict: op 1 addr 0.0.0.0 mask 0.0.0.0 mflags 00000000 flags 000005f0
    restrict: op 1 addr 127.0.0.1 mask 255.255.255.255 mflags 00000000 flags 00000000
    restrict source template mflags 4000 flags 1c0
    restrict: op 1 addr (null) mask (null) mflags 00004000 flags 000001c0
    move_fd: estimated max descriptors: 1024, initial socket boundary: 16
    2 Nov 11:57:05 ntpd[20218]: Listen and drop on 0 v4wildcard 0.0.0.0:123
    2 Nov 11:57:05 ntpd[20218]: Listen normally on 1 lo 127.0.0.1:123
    restrict: op 1 addr 127.0.0.1 mask 255.255.255.255 mflags 00003000 flags 00000001
    2 Nov 11:57:05 ntpd[20218]: Listen normally on 2 eth1 192.168.168.109:123
    restrict: op 1 addr 192.168.168.109 mask 255.255.255.255 mflags 00003000 flags 00000001
    2 Nov 11:57:05 ntpd[20218]: Listen normally on 3 wlan0 192.168.100.1:123
    restrict: op 1 addr 192.168.100.1 mask 255.255.255.255 mflags 00003000 flags 00000001
    2 Nov 11:57:05 ntpd[20218]: Listening on routing socket on fd #27 for interface updates
    key_expire: at 0 associd 60163
    peer_clear: at 0 next 1 associd 60163 refid INIT
    restrict: op 1 addr 10.160.129.161 mask 255.255.255.255 mflags 00004000 flags 000001c0
    restrict_source: 10.160.129.161 host restriction added
    event at 0 10.160.129.161 8011 81 mobilize assoc 60163
    newpeer: 192.168.168.109->10.160.129.161 mode 3 vers 4 poll 6 10 flags 0x101 0x1 ttl 0 key 00000000
    event at 0 0.0.0.0 c016 06 restart
    peer_xmit: at 1 192.168.168.109->10.160.129.161 mode 3 len 48 xmt 0xe52bde52.ddf3c87c
    auth_agekeys: at 1 keys 0 expired 0
    event at 1 10.160.129.161 8014 84 reachable
    clock_filter: n 1 off 30.082946 del 0.048598 dsp 7.945314 jit 0.000002
    peer_xmit: at 3 192.168.168.109->10.160.129.161 mode 3 len 48 xmt 0xe52bde54.ddf0a416
    clock_filter: n 2 off 30.083616 del 0.047583 dsp 3.949228 jit 0.000670
    peer_xmit: at 5 192.168.168.109->10.160.129.161 mode 3 len 48 xmt 0xe52bde56.dde968ab
    clock_filter: n 3 off 30.078398 del 0.054469 dsp 1.951189 jit 0.004895
    peer_xmit: at 7 192.168.168.109->10.160.129.161 mode 3 len 48 xmt 0xe52bde58.dde80026
    clock_filter: n 4 off 30.079499 del 0.074539 dsp 0.952172 jit 0.003164
    peer_xmit: at 9 192.168.168.109->10.160.129.161 mode 3 len 48 xmt 0xe52bde5a.ddea03c8
    clock_filter: n 5 off 30.083616 del 0.044472 dsp 0.452664 jit 0.003340
    2 Nov 11:57:16 ntpd[20218]: ntpd: no servers found
    END OF FILE
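One way to confirm from such a debug log that the server answers consistently (so the problem lies in selection rather than reachability) is to pull the clock_filter samples out of it. The following is only a sketch: the regular expression is based on the line layout seen in this log, the helper name parse_clock_filter is made up for illustration, and the values are assumed to be in seconds.

```python
import re

# The five clock_filter lines copied from the debug log in the question.
LOG = """\
clock_filter: n 1 off 30.082946 del 0.048598 dsp 7.945314 jit 0.000002
clock_filter: n 2 off 30.083616 del 0.047583 dsp 3.949228 jit 0.000670
clock_filter: n 3 off 30.078398 del 0.054469 dsp 1.951189 jit 0.004895
clock_filter: n 4 off 30.079499 del 0.074539 dsp 0.952172 jit 0.003164
clock_filter: n 5 off 30.083616 del 0.044472 dsp 0.452664 jit 0.003340
"""

# Pattern inferred from the log above, not from an ntpd specification.
PATTERN = re.compile(
    r"clock_filter: n (\d+) off (\S+) del (\S+) dsp (\S+) jit (\S+)"
)


def parse_clock_filter(text):
    """Return one dict per clock_filter line found in text."""
    samples = []
    for match in PATTERN.finditer(text):
        n, offset, delay, dispersion, jitter = match.groups()
        samples.append({
            "n": int(n),
            "offset": float(offset),
            "delay": float(delay),
            "dispersion": float(dispersion),
            "jitter": float(jitter),
        })
    return samples


samples = parse_clock_filter(LOG)
mean_offset = sum(s["offset"] for s in samples) / len(samples)
print(f"{len(samples)} samples, mean offset {mean_offset:.6f}")
```

In this log the offset stays close to 30.08 across all five samples while the peer dispersion shrinks with each one, which is what a healthy exchange with the server looks like.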
In addition, when ntpd runs in the background and we query its state with ntpq -p, we get the following result. The st, delay, offset, and reach values all look fine.
    root@S8P20092901:~# ntpq -c as

    ind assid status  conf reach auth condition  last_event cnt
    ===========================================================
      1 59609  9014   yes   yes  none    reject   reachable  1

    root@S8P20092901:~# ntpq -np
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
     10.160.129.161  162.159.200.123  4 u   24   64  377   40.404 -180.122  20.122
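However, ntpd never selects the NTP server as a time source (it never shows "*" or "+" in front of the remote address), and it never sets the local time, even after a long wait.
I looked at the source code. In ntpdate mode (-q), if no clock has been selected and set, ntpd exits once every server has completed its burst: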
    } else {
        peer->burst--;
        if (peer->burst == 0) {
            /*
             * If ntpdate mode and the clock has not been
             * set and all peers have completed the burst,
             * we declare a successful failure.
             */
            if (mode_ntpdate) {
                peer_ntpdate--;
                if (peer_ntpdate == 0) {
                    msyslog(LOG_NOTICE,
                        "ntpd: no servers found");
                    if (!msyslog_term)
                        printf(
                            "ntpd: no servers found\n");
                    exit (0);
                }
            }
        }
    }
However, I still don't understand why ntpd does not select the server and set the time from it. Thanks in advance for your help.
This looks like it could be a root dispersion problem (the error accumulated between the original time source and the server).
You have already provided the output of ntpq -nc associations:

    ind assid status  conf reach auth condition  last_event cnt
    ===========================================================
      1 59609  9014   yes   yes  none    reject   reachable  1
So what is needed now is the detail for the problematic association:

    ntpq -nc 'readvar 59609'
You should get something like this (taken from my own NTP server):

    associd=33428 status=142a reach, sel_candidate, 2 events, sys_peer,
    srcadr=90.255.244.219, srcport=123, dstadr=192.168.1.18, dstport=123,
    leap=00, stratum=1, precision=-20, rootdelay=0.000, rootdisp=1.511,
    refid=PPS,
    reftime=e53ca0fb.4d946a30 Mon, Nov 15 2021 9:03:55.303,
    rec=e53ca11e.bf1413cd Mon, Nov 15 2021 9:04:30.746, reach=377,
    unreach=0, hmode=3, pmode=4, hpoll=10, ppoll=10, headway=0, flash=00 ok,
    keyid=0, offset=-0.249, delay=22.177, dispersion=55.975, jitter=56.489,
    xleave=0.088,
    filtdelay= 157.46 161.45 169.05 22.18 21.68 21.76 186.40 22.04,
    filtoffset= 70.21 70.72 74.51 -0.25 -0.03 -0.26 81.90 -0.34,
    filtdisp= 0.00 15.39 31.02 47.04 63.23 79.22 86.97 94.76
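If you need to check this on many devices, a single variable can be pulled out of saved readvar output with a small script. This is a minimal sketch, assuming the output has been captured as text; read_var is a made-up helper name, and it only handles numeric variables such as rootdisp, rootdelay, or offset.

```python
import re


def read_var(output, name):
    """Return the numeric value of name=... in readvar output, or None."""
    # \b keeps "delay" from matching inside "rootdelay" or "filtdelay".
    pattern = r"\b" + re.escape(name) + r"=([-+]?\d+(?:\.\d+)?)"
    match = re.search(pattern, output)
    return float(match.group(1)) if match else None


# One line taken from the sample readvar output above.
sample = ("leap=00, stratum=1, precision=-20, rootdelay=0.000, "
          "rootdisp=1.511, refid=PPS,")

print(read_var(sample, "rootdisp"))
```

A regular expression is used instead of splitting on commas because some values, such as reftime, contain commas themselves.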
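Look for the rootdisp value. I expect you will find that yours is high, which indicates that too much error has accumulated on the path from the time source to this host. Other than using a different upstream server, there is not much you can do about it. (You can raise tos maxdist, but if you have to do that, you should question the reliability of the upstream server.)

References:
- Cisco - Troubleshoot ISE and NTP Server Synchronization Failures on Microsoft Windows (PDF)
- NTP - reference documentation for ntpq
- ServerFault - Why does NTP think my server is insufficient?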
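Following on from this post: I ran into the same problem.
The solution was indeed to add tos maxdist 30 to /etc/ntp.conf.
Below I list all the steps to check for and resolve it. Note that this should only be done when there is no other time server available: as others have said, it also means that the upstream NTP server is not really reliable. Here are the steps: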
If you run ntpd -dgq, you may get an "unable to bind to wildcard address" error. So before running it, you need to stop the NTP service:

    service ntp stop

or kill the process that is holding the NTP port:

    lsof -i | grep ntp
    kill <pid>
After that, run the ntpd -dgq command. If the last part of the log looks like this, the NTP server is answering but ntpd is not accepting it as a time source:

    ... ... ...
    receive: MATCH_ASSOC dispatch: mode 4/server:AM_PROCPKT
    filegen 2 3854076120
    clock_filter: n 5 off 3.839496 del 0.000455 dsp 0.437525 jit 0.000248
    17 Feb 09:42:02 ntpd[1040]: ntpd: no servers found
In addition, after starting the NTP service again (service ntp start), the same thing can be seen with the following commands. The server is reachable, but time synchronization does not happen:

    root@akulab1:~# ntpq -c as

    ind assid status  conf reach auth condition  last_event cnt
    ===========================================================
      1 34463  9014   yes   yes  none    reject   reachable  1

    root@akulab1:~# ntpq -np
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
     172.16.0.25     .LOCL.           1 u   36   64    7    0.579  3917.57   5.842
As mentioned above, the cause is the large rootdisp value in the output below (use the assid from ntpq -c as as the argument to readvar):

    root@akulab1:~# ntpq -nc 'readvar 34463'
    associd=34463 status=9014 conf, reach, sel_reject, 1 event, reachable,
    srcadr=172.16.0.25, srcport=123, dstadr=172.16.0.133, dstport=123,
    leap=00, stratum=1, precision=-23, rootdelay=0.000, rootdisp=10684.280,
    refid=LOCL,
    reftime=e5b7a483.b4d87c2d Wed, Feb 16 2022 17:27:47.706,
    rec=e5b88b72.a5c64a66 Thu, Feb 17 2022 9:53:06.647, reach=377,
    unreach=0, hmode=3, pmode=4, hpoll=6, ppoll=6, headway=290,
    flash=400 peer_dist, keyid=0, offset=3934.131, delay=0.516,
    dispersion=0.987, jitter=14.653, xleave=0.037,
    filtdelay= 0.52 0.52 0.56 0.54 0.54 0.58 0.45 0.49,
    filtoffset= 3934.13 3930.84 3927.44 3924.11 3920.90 3917.58 3914.35 3911.63,
    filtdisp= 0.00 1.02 2.06 3.08 4.10 5.12 6.11 6.95
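To see why this peer is rejected with flash=400 peer_dist, it helps to estimate the root distance that ntpd compares against its maxdist threshold. The sketch below is a simplified approximation, not ntpd's exact calculation: it assumes the distance is roughly half the round-trip delays plus the dispersions and jitter, it ignores the terms ntpd adds for clock precision and time since the last update, and it assumes the default maxdist of 1.5 seconds. The function name is made up for illustration; the input values come from the readvar output above, which ntpq reports in milliseconds.

```python
DEFAULT_MAXDIST_S = 1.5   # assumed ntpd default for "tos maxdist", in seconds
RAISED_MAXDIST_S = 30.0   # the value used in this answer


def approx_root_distance_s(rootdelay_ms, rootdisp_ms, delay_ms,
                           dispersion_ms, jitter_ms):
    """Rough root distance in seconds (simplified, see the text above)."""
    total_ms = ((rootdelay_ms + delay_ms) / 2
                + rootdisp_ms + dispersion_ms + jitter_ms)
    return total_ms / 1000.0


# Values from the readvar output for association 34463.
distance = approx_root_distance_s(
    rootdelay_ms=0.000,
    rootdisp_ms=10684.280,
    delay_ms=0.516,
    dispersion_ms=0.987,
    jitter_ms=14.653,
)

print(f"approximate root distance: {distance:.3f} s")
print("exceeds default maxdist:", distance > DEFAULT_MAXDIST_S)
print("exceeds maxdist 30:", distance > RAISED_MAXDIST_S)
```

The estimate is dominated almost entirely by rootdisp, which is why a threshold of 30 seconds lets this particular server through while the default does not.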
Finally, these are the commands that add tos maxdist 30 to /etc/ntp.conf and restart the NTP service:

    echo 'tos maxdist 30' >> /etc/ntp.conf
    service ntp restart
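Note that appending with echo adds another copy of the line every time it is run. If the change is rolled out by a script, something like the following sketch keeps it idempotent. It works on the configuration as a string and leaves the file handling to the caller; ensure_maxdist is a made-up helper name, and the server line in the example is only an illustrative configuration, not taken from the original setup.

```python
def ensure_maxdist(conf_text, seconds=30):
    """Append 'tos maxdist <seconds>' unless a tos line already sets maxdist."""
    for line in conf_text.splitlines():
        # Ignore comments, then look for an active "tos ... maxdist ..." line.
        tokens = line.split("#", 1)[0].split()
        if tokens and tokens[0] == "tos" and "maxdist" in tokens[1:]:
            return conf_text
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + f"tos maxdist {seconds}\n"


# Illustrative configuration; the real /etc/ntp.conf will differ.
original = "server 172.16.0.25\n"
once = ensure_maxdist(original)
twice = ensure_maxdist(once)

print(once, end="")
print("unchanged on second run:", once == twice)
```

An existing maxdist setting is left untouched rather than overwritten, so a value chosen deliberately by an administrator is not silently replaced.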
And voilà, the time is now successfully synchronized with your NTP server:

    root@akulab1:~# ntpq -c as

    ind assid status  conf reach auth condition  last_event cnt
    ===========================================================
      1 60446  961a   yes   yes  none  sys.peer    sys_peer  1

    root@akulab1:~# ntpq -np
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    *172.16.0.25     .LOCL.           1 u   15   64    1    0.432    0.314   0.171
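The status column in the ntpq -c as listings (9014 before the change, 961a after it) packs the same information as the conf, reach, condition, last_event, and cnt columns. The sketch below decodes it according to my understanding of the peer status word layout in the ntpq documentation: the top five bits are flags, the next three are the selection code, and the low byte holds the event count and event code. Treat the bit layout and the selection names as assumptions to check against your ntpq version; decode_peer_status is a made-up helper name.

```python
# Selection codes as I understand them from the ntpq documentation.
SELECT = {
    0: "sel_reject",
    1: "sel_falsetick",
    2: "sel_excess",
    3: "sel_outlier",
    4: "sel_candidate",
    5: "sel_backup",
    6: "sel_sys.peer",
    7: "sel_pps.peer",
}


def decode_peer_status(word_hex):
    """Split a 16-bit peer status word (hex string) into its fields."""
    word = int(word_hex, 16)
    return {
        "configured": bool(word & 0x8000),
        "reachable": bool(word & 0x1000),
        "select": SELECT[(word >> 8) & 0x7],
        "event_count": (word >> 4) & 0xF,
        "event_code": word & 0xF,
    }


for status in ("9014", "961a"):
    print(status, decode_peer_status(status))
```

Both words show a configured, reachable peer; the only difference that matters is the selection field, which moves from reject to sys.peer once maxdist is raised.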