
NTP server is reachable but is never selected and never sets the time

  • February 17, 2022

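We have some embedded devices that use ntpd (4.2.8p10) to synchronize time. One of our customers runs their own NTP server on an internal network. In ntpd -dgq debug mode we can see that the server is reachable, and we get offset, delay and jitter values from it. However, ntpd just exits with "ntpd: no servers found" and never selects the server or sets the local time.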

2 Nov 11:57:05 ntpd[20218]: ntpd 4.2.8p10@1.3728-o Thu Jul 26 19:52:20 UTC 2018 (2): Starting
2 Nov 11:57:05 ntpd[20218]: Command line: ntpd -dgq
2 Nov 11:57:05 ntpd[20218]: proto: precision = 2.000 usec (-19)
Finished Parsing!!
restrict: op 1 addr 0.0.0.0 mask 0.0.0.0 mflags 00000000 flags 000005f0
restrict: op 1 addr 127.0.0.1 mask 255.255.255.255 mflags 00000000 flags 00000000
restrict source template mflags 4000 flags 1c0
restrict: op 1 addr (null) mask (null) mflags 00004000 flags 000001c0
move_fd: estimated max descriptors: 1024, initial socket boundary: 16
2 Nov 11:57:05 ntpd[20218]: Listen and drop on 0 v4wildcard 0.0.0.0:123
2 Nov 11:57:05 ntpd[20218]: Listen normally on 1 lo 127.0.0.1:123
restrict: op 1 addr 127.0.0.1 mask 255.255.255.255 mflags 00003000 flags 00000001
2 Nov 11:57:05 ntpd[20218]: Listen normally on 2 eth1 192.168.168.109:123
restrict: op 1 addr 192.168.168.109 mask 255.255.255.255 mflags 00003000 flags 00000001
2 Nov 11:57:05 ntpd[20218]: Listen normally on 3 wlan0 192.168.100.1:123
restrict: op 1 addr 192.168.100.1 mask 255.255.255.255 mflags 00003000 flags 00000001
2 Nov 11:57:05 ntpd[20218]: Listening on routing socket on fd #27 for interface updates
key_expire: at 0 associd 60163
peer_clear: at 0 next 1 associd 60163 refid INIT
restrict: op 1 addr 10.160.129.161 mask 255.255.255.255 mflags 00004000 flags 000001c0
restrict_source: 10.160.129.161 host restriction added
event at 0 10.160.129.161 8011 81 mobilize assoc 60163
newpeer: 192.168.168.109->10.160.129.161 mode 3 vers 4 poll 6 10 flags 0x101 0x1 ttl 0 key 00000000
event at 0 0.0.0.0 c016 06 restart
peer_xmit: at 1 192.168.168.109->10.160.129.161 mode 3 len 48 xmt 0xe52bde52.ddf3c87c
auth_agekeys: at 1 keys 0 expired 0
event at 1 10.160.129.161 8014 84 reachable
clock_filter: n 1 off 30.082946 del 0.048598 dsp 7.945314 jit 0.000002
peer_xmit: at 3 192.168.168.109->10.160.129.161 mode 3 len 48 xmt 0xe52bde54.ddf0a416
clock_filter: n 2 off 30.083616 del 0.047583 dsp 3.949228 jit 0.000670
peer_xmit: at 5 192.168.168.109->10.160.129.161 mode 3 len 48 xmt 0xe52bde56.dde968ab
clock_filter: n 3 off 30.078398 del 0.054469 dsp 1.951189 jit 0.004895
peer_xmit: at 7 192.168.168.109->10.160.129.161 mode 3 len 48 xmt 0xe52bde58.dde80026
clock_filter: n 4 off 30.079499 del 0.074539 dsp 0.952172 jit 0.003164
peer_xmit: at 9 192.168.168.109->10.160.129.161 mode 3 len 48 xmt 0xe52bde5a.ddea03c8
clock_filter: n 5 off 30.083616 del 0.044472 dsp 0.452664 jit 0.003340
2 Nov 11:57:16 ntpd[20218]: ntpd: no servers found
END OF FILE

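In addition, when we run ntpd in the background and query its state with ntpq -c as and ntpq -np, we get the following results. The st, delay, offset and reach values all look fine.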

root@S8P20092901:~# ntpq -c as

ind assid status  conf reach auth condition  last_event cnt
===========================================================
 1 59609  9014   yes   yes  none    reject   reachable  1

root@S8P20092901:~# ntpq -np
    remote           refid      st t when poll reach   delay    offset  jitter
==============================================================================
10.160.129.161  162.159.200.123  4 u  24   64   377    40.404    -180.122   20.122

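However, even after a long wait, ntpd never selects the NTP server as a time source (it never shows "*" or "+" in front of the remote address) and never sets the local time.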
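The status column in ntpq -c as is a 16-bit peer status word, and decoding it shows why no "*" or "+" appears. The following is a minimal sketch, assuming the field layout and code tables given in the ntpq documentation (flag bits, a 3-bit selection code, a 4-bit event count and a 4-bit event code); the function and table names are made up for illustration.

```python
# Flag bits in the high byte of the peer status word.
STATUS_FLAGS = {0x80: "conf", 0x40: "authenb", 0x20: "auth", 0x10: "reach", 0x08: "bcst"}

# Selection codes (low 3 bits of the high byte) and the matching tally
# character that ntpq -p prints in front of the remote address.
SELECT_CODES = ["sel_reject", "sel_falsetick", "sel_excess", "sel_outlier",
                "sel_candidate", "sel_backup", "sys_peer", "pps_peer"]
TALLY_CHARS = " x.-+#*o"

# Event codes (low 4 bits of the status word).
EVENT_CODES = {
    0x1: "mobilize", 0x2: "demobilize", 0x3: "unreachable", 0x4: "reachable",
    0x5: "restart", 0x6: "no_reply", 0x7: "rate_exceeded", 0x8: "access_denied",
    0x9: "leap_armed", 0xA: "sys_peer", 0xB: "clock_event", 0xC: "bad_auth",
    0xD: "popcorn", 0xE: "interleave_mode", 0xF: "interleave_error",
}


def decode_peer_status(word_hex):
    """Split a peer status word such as '9014' into its four fields."""
    word = int(word_hex, 16)
    high = word >> 8
    select = high & 0x07
    return {
        "flags": [name for bit, name in STATUS_FLAGS.items() if high & bit],
        "select": SELECT_CODES[select],
        "tally": TALLY_CHARS[select],
        "event_count": (word >> 4) & 0x0F,
        "last_event": EVENT_CODES.get(word & 0x0F, "unspecified"),
    }


print(decode_peer_status("9014"))
```

For 9014 this gives the flags conf and reach, the selection code sel_reject (tally character is a blank), one event, and the last event reachable. In other words the server answers, but the selection algorithm discards it.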

I looked at the source code. In ntpdate (-q) mode, if no clock has been selected and set, ntpd exits once every server has completed its burst:

   } else {
       peer->burst--;
       if (peer->burst == 0) {

           /*
            * If ntpdate mode and the clock has not been
            * set and all peers have completed the burst,
            * we declare a successful failure.
            */
           if (mode_ntpdate) {
               peer_ntpdate--;
               if (peer_ntpdate == 0) {
                   msyslog(LOG_NOTICE,
                       "ntpd: no servers found");
                   if (!msyslog_term)
                       printf(
                           "ntpd: no servers found\n");
                   exit (0);
               }
           }
       }
   }

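However, I still don't understand why ntpd does not select the server and set the time from it. Thanks in advance for your help.

This looks like it could be a root dispersion problem (the error accumulated between the original time source and the server).

You have already provided the output of ntpq -nc associations: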

ind assid status  conf reach auth condition  last_event cnt
===========================================================
 1 59609  9014   yes   yes  none    reject   reachable  1

So what is needed now is the detail for this problematic association:

ntpq -nc 'readvar 59609'

You should get something like this (taken from my own NTP server):

associd=33428 status=142a reach, sel_candidate, 2 events, sys_peer,
srcadr=90.255.244.219, srcport=123, dstadr=192.168.1.18, dstport=123,
leap=00, stratum=1, precision=-20, rootdelay=0.000, rootdisp=1.511,
refid=PPS, reftime=e53ca0fb.4d946a30  Mon, Nov 15 2021  9:03:55.303,
rec=e53ca11e.bf1413cd  Mon, Nov 15 2021  9:04:30.746, reach=377,
unreach=0, hmode=3, pmode=4, hpoll=10, ppoll=10, headway=0, flash=00 ok,
keyid=0, offset=-0.249, delay=22.177, dispersion=55.975, jitter=56.489,
xleave=0.088,
filtdelay=   157.46  161.45  169.05   22.18   21.68   21.76  186.40   22.04,
filtoffset=   70.21   70.72   74.51   -0.25   -0.03   -0.26   81.90   -0.34,
filtdisp=      0.00   15.39   31.02   47.04   63.23   79.22   86.97   94.76

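Look at the rootdisp value. I expect you will find that yours is high, which means too much error has accumulated on the path from the time source to your server. There is not much you can do about that other than use a different upstream server. (You can raise the limit with tos maxdist, but if you have to do that you should question the reliability of the upstream server.)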
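If you want to pull rootdisp out of the readvar output in a script, a rough parser is enough. This is only a sketch: parse_readvar is a hypothetical helper, and it keeps just the first whitespace-delimited token of each value, so multi-value fields such as filtdelay and the human-readable date after reftime are truncated. The sample text is copied from the output above.

```python
import re

SAMPLE = """\
associd=33428 status=142a reach, sel_candidate, 2 events, sys_peer,
srcadr=90.255.244.219, srcport=123, dstadr=192.168.1.18, dstport=123,
leap=00, stratum=1, precision=-20, rootdelay=0.000, rootdisp=1.511,
unreach=0, hmode=3, pmode=4, hpoll=10, ppoll=10, headway=0, flash=00 ok,
keyid=0, offset=-0.249, delay=22.177, dispersion=55.975, jitter=56.489,
"""


def parse_readvar(text):
    """Map each name=value pair to the first token of its value (as a string)."""
    values = {}
    for match in re.finditer(r"(\w+)=\s*([^,\s]+)", text):
        values[match.group(1)] = match.group(2)
    return values


values = parse_readvar(SAMPLE)
print("rootdisp (ms):", float(values["rootdisp"]))
print("flash:", values["flash"])
```

On a live system the same function could be fed the text returned by ntpq -nc 'readvar <assid>' instead of SAMPLE.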


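I found this post because I ran into the same problem.

The solution is indeed to add tos maxdist 30 to /etc/ntp.conf. Below I list all the steps to check for the problem and fix it. Note that you should only do this when no other time server is available: as others have said, needing it means the upstream NTP server is not really reliable.

Here are the steps:

If you run ntpd -dgq, you may get an "unable to bind to wildcard address" error. Before running it, stop the NTP service with service ntp stop, or kill the process that is holding the NTP port: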

lsof -i | grep ntp
kill <pid>

After that, run ntpd -dgq. If the log ends like the excerpt below, the NTP server is answering but ntpd is not accepting it as a time source:

...
...
...
receive: MATCH_ASSOC dispatch: mode 4/server:AM_PROCPKT
filegen  2 3854076120
clock_filter: n 5 off 3.839496 del 0.000455 dsp 0.437525 jit 0.000248
17 Feb 09:42:02 ntpd[1040]: ntpd: no servers found

After starting the NTP service again (service ntp start), the following commands show the same thing: the server is reachable, but time synchronization does not happen:

root@akulab1:~# ntpq -c as
ind assid status  conf reach auth condition  last_event cnt
===========================================================
 1 34463  9014   yes   yes  none    reject   reachable  1
root@akulab1:~# ntpq -np
    remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
172.16.0.25     .LOCL.           1 u   36   64    7    0.579  3917.57   5.842
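The reach column is easy to misread. It is an 8-bit shift register printed in octal, with the least significant bit holding the most recent poll. A small sketch to expand it (decode_reach is an illustrative name, not part of any NTP tool):

```python
def decode_reach(reach_octal):
    """Return the last 8 poll results, newest first (True = a reply arrived)."""
    register = int(reach_octal, 8) & 0xFF
    return [bool((register >> i) & 1) for i in range(8)]


print(decode_reach("7"))    # the three most recent polls were answered
print(decode_reach("377"))  # all eight recent polls were answered
```

So reach 7 here just means ntpd has only polled three times since it started, and every poll was answered. Reachability is not what is causing the rejection.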

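As mentioned above, the cause is the very large rootdisp value in the output below (pass the assid from ntpq -c as to readvar). ntpq reports rootdisp in milliseconds, so 10684.280 is about 10.7 seconds, and flash=400 peer_dist shows that the peer failed the distance test: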

root@akulab1:~# ntpq -nc 'readvar 34463'
associd=34463 status=9014 conf, reach, sel_reject, 1 event, reachable,
srcadr=172.16.0.25, srcport=123, dstadr=172.16.0.133, dstport=123,
leap=00, stratum=1, precision=-23, rootdelay=0.000, rootdisp=10684.280,
refid=LOCL, reftime=e5b7a483.b4d87c2d  Wed, Feb 16 2022 17:27:47.706,
rec=e5b88b72.a5c64a66  Thu, Feb 17 2022  9:53:06.647, reach=377,
unreach=0, hmode=3, pmode=4, hpoll=6, ppoll=6, headway=290,
flash=400 peer_dist, keyid=0, offset=3934.131, delay=0.516,
dispersion=0.987, jitter=14.653, xleave=0.037,
filtdelay=     0.52    0.52    0.56    0.54    0.54    0.58    0.45    0.49,
filtoffset= 3934.13 3930.84 3927.44 3924.11 3920.90 3917.58 3914.35 3911.63,
filtdisp=      0.00    1.02    2.06    3.08    4.10    5.12    6.11    6.95
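To see why this peer is rejected by default but accepted with a limit of 30, it helps to estimate its root distance. The sketch below is an approximation modeled on the root distance definition in RFC 5905. It leaves out the aging term (PHI times the time since the last update) and the minimum-dispersion floor, so ntpd's own figure will differ slightly. The 1.5 s default for tos maxdist is taken from the ntpd documentation; the function name is made up for this example.

```python
DEFAULT_MAXDIST_S = 1.5  # documented default for "tos maxdist", in seconds


def approx_root_distance_s(delay_ms, rootdelay_ms, dispersion_ms, rootdisp_ms, jitter_ms):
    """Approximate root distance in seconds from ntpq readvar values (milliseconds)."""
    total_ms = (delay_ms + rootdelay_ms) / 2 + dispersion_ms + rootdisp_ms + jitter_ms
    return total_ms / 1000.0


# Values from the readvar output above: delay, rootdelay, dispersion, rootdisp, jitter.
distance = approx_root_distance_s(0.516, 0.000, 0.987, 10684.280, 14.653)

print(f"approximate root distance: {distance:.3f} s")
print("rejected with default maxdist:", distance > DEFAULT_MAXDIST_S)
print("rejected with maxdist 30:", distance > 30)
```

The estimate is dominated by rootdisp and comes to roughly 10.7 s, well above 1.5 s and comfortably below 30 s.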

Finally, these are the commands that add tos maxdist 30 to /etc/ntp.conf and restart the NTP service:

echo 'tos maxdist 30' >> /etc/ntp.conf
service ntp restart

And voilà, the time is now synchronized with your NTP server:

root@akulab1:~# ntpq -c as
ind assid status  conf reach auth condition  last_event cnt
===========================================================
 1 60446  961a   yes   yes  none  sys.peer    sys_peer  1
root@akulab1:~# ntpq -np
    remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*172.16.0.25     .LOCL.           1 u   15   64    1    0.432    0.314   0.171
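One caveat about the echo ... >> /etc/ntp.conf step: running it twice appends the line twice. If the change is applied by a provisioning script, an idempotent edit is safer. The helper below is a hypothetical sketch that works on the file contents as a string. It only recognizes a line of the simple form "tos maxdist N"; a tos line that combines several options would need smarter handling.

```python
def ensure_tos_maxdist(conf_text, seconds):
    """Return conf_text with exactly one simple 'tos maxdist' setting of the given value."""
    wanted = f"tos maxdist {seconds}"
    lines = conf_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Ignore trailing comments, then look for "tos maxdist <value>".
        tokens = line.split("#", 1)[0].split()
        if tokens[:2] == ["tos", "maxdist"]:
            lines[i] = wanted
            replaced = True
    if not replaced:
        lines.append(wanted)
    return "\n".join(lines) + "\n"


original = "server 172.16.0.25\n"
once = ensure_tos_maxdist(original, 30)
twice = ensure_tos_maxdist(once, 30)
print(once, end="")
print("idempotent:", once == twice)
```

Writing the result back to /etc/ntp.conf and restarting the service are left out on purpose, since both need root and depend on the init system in use.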

Source: https://unix.stackexchange.com/questions/677523