Systemd
Systemd Restart=always is not honored
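Note: I wrote an article on Medium that explains how to create a service and how to avoid this particular problem: Creating a Linux service with systemd.
Original question:
I am using systemd to keep a worker script running at all times: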
    [Unit]
    Description=My worker
    After=mysqld.service

    [Service]
    Type=simple
    Restart=always
    ExecStart=/path/to/script

    [Install]
    WantedBy=multi-user.target
The restart works fine if the script exits normally after a few minutes, but I noticed that if it repeatedly fails on startup, systemd gives up trying to start it:

    Jun 14 11:10:31 localhost systemd[1]: test.service: Main process exited, code=exited, status=1/FAILURE
    Jun 14 11:10:31 localhost systemd[1]: test.service: Unit entered failed state.
    Jun 14 11:10:31 localhost systemd[1]: test.service: Failed with result 'exit-code'.
    Jun 14 11:10:31 localhost systemd[1]: test.service: Service hold-off time over, scheduling restart.
    Jun 14 11:10:31 localhost systemd[1]: test.service: Start request repeated too quickly.
    Jun 14 11:10:31 localhost systemd[1]: Failed to start My worker.
    Jun 14 11:10:31 localhost systemd[1]: test.service: Unit entered failed state.
    Jun 14 11:10:31 localhost systemd[1]: test.service: Failed with result 'start-limit'.
Similarly, if my worker script fails several times with an exit status of 255, systemd gives up trying to restart it:

    Jun 14 11:25:51 localhost systemd[1]: test.service: Failed with result 'exit-code'.
    Jun 14 11:25:51 localhost systemd[1]: test.service: Service hold-off time over, scheduling restart.
    Jun 14 11:25:51 localhost systemd[1]: test.service: Start request repeated too quickly.
    Jun 14 11:25:51 localhost systemd[1]: Failed to start My worker.
    Jun 14 11:25:51 localhost systemd[1]: test.service: Unit entered failed state.
    Jun 14 11:25:51 localhost systemd[1]: test.service: Failed with result 'start-limit'.
Is there a way to force systemd to always retry after a few seconds?
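I would like to expand on Rahul's answer a bit.
systemd tries to restart the unit a limited number of times (StartLimitBurst) and stops trying if that attempt count is reached within StartLimitIntervalSec. Both options belong in the [Unit] section. The default delay between executions is 100 ms (RestartSec), so a service that fails immediately reaches the rate limit very quickly. Once that happens, systemd makes no further automatic restart attempts, even for units that define a restart policy: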
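"Note that units which are configured for Restart= and which reach the start limit are not attempted to be restarted anymore; however, they may still be restarted manually at a later point, from which point on, the restart logic is again activated."

Rahul's answer helps because the longer delay prevents the failure count from being reached within StartLimitIntervalSec. The correct answer is to set both RestartSec and StartLimitBurst to sensible values.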
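As a minimal sketch of the advice above, the unit from the question could be adjusted as follows. The numeric values are illustrative assumptions, not recommendations from the original answer. The idea is that with a 5-second RestartSec, only about three start attempts can fall inside any 10-second window, so a burst limit of 5 is never reached and systemd keeps retrying.

```ini
[Unit]
Description=My worker
After=mysqld.service
# Rate limit: give up only if 5 starts happen within 10 seconds.
# (Illustrative values; setting StartLimitIntervalSec=0 disables the limit entirely.)
StartLimitIntervalSec=10
StartLimitBurst=5

[Service]
Type=simple
Restart=always
# Wait 5 seconds between restarts instead of the 100 ms default,
# so a fast-failing script cannot exhaust the burst within the interval.
RestartSec=5
ExecStart=/path/to/script

[Install]
WantedBy=multi-user.target
```

After editing the unit file, run `systemctl daemon-reload` so systemd picks up the change. A unit that has already hit the limit can be cleared with `systemctl reset-failed test.service` and then started again. On older systemd releases the interval option may instead be named StartLimitInterval and live in the [Service] section, so it is worth checking the systemd.unit and systemd.service man pages for the installed version.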