Strongswan VPN does not work unless I ping out manually
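We have successfully set up a strongSwan VPN on our network to communicate with Google Cloud VPN.
Sometimes we leave it idle for a while, say overnight, and that is when the problem appears. If I try to ping from Google to our network, it does not work and no packets get through. If I ping from our side to Google, it works, and the pings from the Google side that were blocked then start working too.
It looks as if strongSwan goes into a sleep mode on our side and only wakes up when I ping out manually, not when it receives packets. I can't find any option in the documentation that addresses this. Has anyone run into this problem and solved it somehow?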
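Edit: There is no firewall on our side that could explain this behavior, and on the Google side we can only set the IP ranges that are allowed through the firewall, nothing else. But since it uses their own VPN service to communicate with our strongSwan server, I strongly doubt that it comes from them.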
Here is what ipsec status returns on our side before the problem occurs:
net-net[72]: ESTABLISHED 113 minutes ago, 79.xxx.xxx.xxx[79.xxx.xxx.xxx]...146.xxx.xxx.xxx[146.xxx.xxx.xxx]
net-net{255}: INSTALLED, TUNNEL, reqid 24, ESP SPIs: c5xxxxxx 4exxxxxx
net-net{255}: 192.168.0.0/24 192.168.17.0/24 === 10.132.0.0/20
Here is what ipsec statusall returns afterwards:
Status of IKE charon daemon (strongSwan 5.3.5, Linux 4.4.0-64-generic, x86_64):
  uptime: 22 days, since Feb 27 15:21:33 2017
  malloc: sbrk 2568192, mmap 0, used 370288, free 2197904
  worker threads: 11 of 16 idle, 5/0/0/0 working, job queue: 0/0/0/0, scheduled: 11
  loaded plugins: charon aes agent attr connmark constraints dnskey fips-prf gcm md4 openssl pem pgp pkcs1 pkcs12 pkcs7 pkcs8 pubkey rc2 resolve revocation sshkey test-vectors x509 xcbc sha1 sha2 md5 gmp random nonce hmac stroke kernel-netlink socket-default updown
Listening IP addresses:
  192.168.17.205
  79.xxx.xxx.xxx
Connections:
  net-net: 79.xxx.xxx.xxx...146.xxx.xxx.xxx IKEv2, dpddelay=30s
  net-net: local: [79.xxx.xxx.xxx] uses pre-shared key authentication
  net-net: remote: [146.xxx.xxx.xxx] uses pre-shared key authentication
  net-net: child: 192.168.17.0/24 192.168.0.0/24 === 10.132.0.0/20 TUNNEL, dpdaction=restart
Security Associations (1 up, 0 connecting):
  net-net[72]: ESTABLISHED 2 hours ago, 79.xxx.xxx.xxx[79.xxx.xxx.xxx]...146.xxx.xxx.xxx[146.xxx.xxx.xxx]
  net-net[72]: IKEv2 SPIs: 0fd4efxxxxxx 17ed000axxxxxx*, pre-shared key reauthentication in 108 minutes
  net-net[72]: IKE proposal: AES_CBC_128/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_2048
  net-net{255}: INSTALLED, TUNNEL, reqid 24, ESP SPIs: c5b822fe_i 4ed83bd8_o
  net-net{255}: AES_GCM_16_128, 3916 bytes_i (47 pkts, 1020s ago), 3956 bytes_o (47 pkts, 1020s ago), rekeying in 7 hours
  net-net{255}: 192.168.0.0/24 192.168.17.0/24 === 10.132.0.0/20
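The "1020s ago" fields in the CHILD_SA line show how long the tunnel has been idle in each direction, which is the number to watch when chasing an idle-related problem. Below is a minimal sketch for pulling those counters out of the statusall output. The regular expression is written against the line exactly as shown above and may not match other strongSwan versions; the function name is made up for illustration.

```python
import re

# Matches one direction's counters, e.g. "3916 bytes_i (47 pkts, 1020s ago)".
# Assumption: the format is the one shown in the statusall output above.
TRAFFIC_RE = re.compile(r"(\d+) bytes_([io]) \((\d+) pkts, (\d+)s ago\)")


def parse_child_sa_traffic(line):
    """Return per-direction counters ('i' = inbound, 'o' = outbound) for one line."""
    stats = {}
    for nbytes, direction, pkts, idle in TRAFFIC_RE.findall(line):
        stats[direction] = {
            "bytes": int(nbytes),
            "pkts": int(pkts),
            "idle_s": int(idle),
        }
    return stats


sample = (
    "net-net{255}: AES_GCM_16_128, 3916 bytes_i (47 pkts, 1020s ago), "
    "3956 bytes_o (47 pkts, 1020s ago), rekeying in 7 hours"
)
stats = parse_child_sa_traffic(sample)
print(stats["i"]["idle_s"], stats["o"]["idle_s"])  # 1020 1020
```

A line that carries no packet counters simply yields an empty dict, so a caller can treat "no match" as "nothing to report" instead of failing.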
And ipsec.conf:
config setup

conn %default
    ikelifetime=24h
    keylife=8h
    rekeymargin=9m
    keyingtries=1
    authby=psk
    keyexchange=ikev2
    mobike=no
    esp=aes128gcm16-modp2048!
    dpdaction=restart

conn net-net
    left=79.xxx.xxx.xxx
    leftsubnet=192.168.17.0/24,192.168.0.0/24
    leftid=79.xxx.xxx.xxx
    leftfirewall=yes
    leftdns=xxx....
    right=146.xxx.xxx.xxx
    rightsubnet=10.132.0.0/20
    rightid=146.xxx.xxx.xxx
    auto=start
In the logs on the Google side, I noticed that at the moment I send the ping test, it sends some requests to re-create the CHILD_SA:
"creating rekey job for CHILD_SA ESP/0xxxxxxxxx/79.xxx.xxx.xxx" ...
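Once the CHILD_SA is established with its SPIs, the ping goes through, even though the ESP SPIs are the same before and after. I also see "rekeying in 7 hours" in ipsec statusall. Could the problem be that there is no activity for more than 7 hours overnight?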
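For reference, the "rekeying in 7 hours" figure is roughly what the posted settings would produce. As I understand the ipsec.conf documentation, the rekey time is lifetime - (margintime + random(0, margintime * rekeyfuzz)), with rekeyfuzz defaulting to 100% when it is not set. The sketch below only does that arithmetic for keylife=8h and rekeymargin=9m; the function names are made up for illustration.

```python
def rekey_window(lifetime_s, margin_s, fuzz_percent=100):
    """Return (earliest, latest) rekey time in seconds after the SA is installed.

    Assumes: rekeytime = lifetime - (margin + random(0, margin * fuzz)).
    """
    max_fuzz_s = margin_s * fuzz_percent // 100
    earliest = lifetime_s - (margin_s + max_fuzz_s)  # random part at its maximum
    latest = lifetime_s - margin_s                   # random part at zero
    return earliest, latest


def fmt(seconds):
    """Format a duration in seconds as e.g. '7h42m'."""
    hours, rest = divmod(seconds, 3600)
    return f"{hours}h{rest // 60:02d}m"


# keylife=8h and rekeymargin=9m from the ipsec.conf above; rekeyfuzz left at 100%.
earliest, latest = rekey_window(8 * 3600, 9 * 60)
print(fmt(earliest), fmt(latest))  # 7h42m 7h51m
```

So on this side a rekey would be scheduled somewhere between 7h42m and 7h51m after the CHILD_SA is installed. As far as I can tell that timer depends on the age of the SA, not on whether any traffic is flowing.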
Here is the charon log:
Mar 22 07:56:43 vpn07 charon: 11[ENC] parsed CREATE_CHILD_SA request 223 [ N(REKEY_SA) SA No KE TSi TSr ]
Mar 22 07:56:43 vpn07 charon: 11[IKE] CHILD_SA net-net{255} established with SPIs c5b8xxxxxxx_o and TS 192.168.0.0/24 192.168.17.0/24 === 10.132.0.0/20
Mar 22 07:56:43 vpn07 charon: 11[ENC] generating CREATE_CHILD_SA response 223 [ SA No KE TSi TSr ]
Mar 22 07:56:43 vpn07 charon: 05[IKE] received DELETE for ESP CHILD_SA with SPI 7dd6xxxx
Mar 22 07:56:43 vpn07 charon: 05[IKE] closing CHILD_SA net-net{254} with SPIs ce7xxxx (95264 bytes) 7ddxxxxx (4885433 bytes) and TS 192.168.0.0/24 192.168.17.0/24 === 10.132.0.0/20
Mar 22 07:56:43 vpn07 charon: 05[IKE] sending DELETE for ESP CHILD_SA with SPI ce75xxxxx
Mar 22 07:56:43 vpn07 charon: 05[IKE] CHILD_SA closed
And the Google log:
D sending DPD request
D CHILD_SA closed
D received DELETE for ESP CHILD_SA with SPI cexxxxx
D parsed INFORMATIONAL response 224 [ D ]
D received packet: from 79.xxx.xxx.xxx[500] to 146.xxx.xxx.xxx[500] (76 bytes)
D sending packet: from 146.xxx.xxx.xxx[500] to 79.xxx.xxx.xxx[500] (76 bytes)
D generating INFORMATIONAL request 224 [ D ]
D sending DELETE for ESP CHILD_SA with SPI 7dxxxxxx
I closing CHILD_SA vpn_79.xxx.xxx.xxx{33} with SPIs 7dxxxxx (5073648 bytes) cexxxxxx (95264 bytes) and TS 10.132.0.0/20 === 192.168.0.0/24 192.168.17.0/24
I CHILD_SA vpn_79.xxx.xxx.xxx{34} established with SPIs 4exxxxxx c5xxxxxx and TS 10.132.0.0/20 === 192.168.0.0/24 192.168.17.0/24
D handling HA CHILD_SA vpn_79.xxx.xxx.xxx{34} 10.132.0.0/20 === 192.168.0.0/24 192.168.17.0/24 (segment in: 1*, out: 1*)
D parsed CREATE_CHILD_SA response 223 [ SA No KE TSi TSr ]
D received packet: from 79.xxx.xxx.xxx[500] to 146.xxx.xxx.xxx[500] (476 bytes)
D sending packet: from 146.xxx.xxx.xxx[500] to 79.xxx.xxx.xxx[500] (620 bytes)
D generating CREATE_CHILD_SA request 223 [ N(REKEY_SA) SA No KE TSi TSr ]
I establishing CHILD_SA vpn_79.xxx.xxx.xxx{1}
D creating rekey job for CHILD_SA ESP/0xxxxxxx/79.xxx.xxx.xxx
D parsed INFORMATIONAL response 222 [ ]
D received packet: from 79.xxx.xxx.xxx[500] to 146.xxx.xxx.xxx[500] (76 bytes)
D sending packet: from 146.xxx.xxx.xxx[500] to 79.xxx.xxx.xxx[500] (76 bytes)
D generating INFORMATIONAL request 222 [ ]
D sending DPD request
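It looks like your strongSwan VPN client sits behind a firewall or NAT device that drops the "connection" after a period of inactivity (this is probably UDP, so "connection" is not a great term). Any incoming data belonging to that connection is then treated as invalid and dropped (your FW/NAT device may have a log line about this). Later, when you ping Google from the inside, the connection is re-established and your firewall/NAT device considers incoming data valid again.
The solution is to keep your firewall/NAT device from dropping the "connection" by ensuring a regular flow of data (one UDP message per minute is probably enough). Look for a keepalive mechanism built into strongSwan and enable it.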
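If no built-in option turns out to fit, a crude stopgap is to generate the traffic yourself from a host on your side of the tunnel. The following is only a sketch under assumptions: the target address 10.132.0.2 and the discard port 9 are hypothetical placeholders for some reachable host in the remote 10.132.0.0/20 subnet, and the 60-second interval is taken from the "one message per minute" estimate above, not measured against your device's timeout.

```python
import socket
import time


def send_keepalive(host, port, payload=b"keepalive"):
    """Send one UDP datagram and return the number of bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


def keepalive_loop(host, port, interval_s=60.0, rounds=None):
    """Send a datagram every interval_s seconds; rounds=None means run forever."""
    sent = 0
    while rounds is None or sent < rounds:
        send_keepalive(host, port)
        sent += 1
        if rounds is None or sent < rounds:
            time.sleep(interval_s)  # no sleep after the final datagram
    return sent


# Example call (hypothetical target inside the remote subnet, discard port):
# keepalive_loop("10.132.0.2", 9)
```

The datagram only helps if it is actually routed through the tunnel, so the target must fall inside the remote subnet covered by the CHILD_SA. It does not matter whether anything is listening on the far end, since the point is just to keep outbound traffic flowing past your firewall/NAT device.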