Linux
Out of memory, but swap available
My server runs out of memory even though swap is available.
Why?
I can reproduce it like this:
eat_20GB_RAM() {
  perl -e '$a="c"x10000000000;print "OK\n";sleep 10000';
}
export -f eat_20GB_RAM
parallel -j0 eat_20GB_RAM ::: {1..25} &
When that has settled (i.e., all processes have reached the sleep), I start some more:
parallel --delay 5 -j0 eat_20GB_RAM ::: {1..25} &
When that has settled (i.e., all processes have reached the sleep), about 800 GB of RAM plus swap is in use:
$ free -m
              total        used        free      shared  buff/cache   available
Mem:         515966      440676       74514           1         775       73392
Swap:       1256720      341124      915596
When I then run a few more:
parallel --delay 15 -j0 eat_20GB_RAM ::: {1..50} &
I start getting:
Out of memory!
even though there is clearly swap available.
$ free
              total        used        free      shared  buff/cache   available
Mem:      528349276   518336524     7675784       14128     2336968     7316984
Swap:    1286882284  1017746244   269136040
Why?
$ cat /proc/meminfo
MemTotal: 528349276 kB
MemFree: 7647352 kB
MemAvailable: 7281164 kB
Buffers: 70616 kB
Cached: 1503044 kB
SwapCached: 10404 kB
Active: 476833404 kB
Inactive: 20837620 kB
Active(anon): 476445828 kB
Inactive(anon): 19673864 kB
Active(file): 387576 kB
Inactive(file): 1163756 kB
Unevictable: 18776 kB
Mlocked: 18776 kB
SwapTotal: 1286882284 kB
SwapFree: 269134804 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 496106244 kB
Mapped: 190524 kB
Shmem: 14128 kB
KReclaimable: 753204 kB
Slab: 15772584 kB
SReclaimable: 753204 kB
SUnreclaim: 15019380 kB
KernelStack: 46640 kB
PageTables: 3081488 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 1551056920 kB
Committed_AS: 1549560424 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 1682132 kB
VmallocChunk: 0 kB
Percpu: 202752 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
FileHugePages: 0 kB
FilePmdMapped: 0 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
DirectMap4k: 12251620 kB
DirectMap2M: 522496000 kB
DirectMap1G: 3145728 kB
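In /proc/meminfo you find:
CommitLimit: 1551056920 kB
Committed_AS: 1549560424 kB
So you are at the commit limit: only 1496496 kB (about 1.4 GiB) can still be committed, which is far less than one more of the test processes asks for.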
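To read that margin directly instead of subtracting by hand, something like the following sketch may help. The function name commit_headroom is made up for illustration, and the optional file argument exists only so the function can be tried against a saved copy of /proc/meminfo.

```shell
# commit_headroom [FILE]
# Print CommitLimit - Committed_AS in kB.
# FILE defaults to /proc/meminfo (hypothetical helper, not a standard tool).
commit_headroom() {
  awk '
    $1 == "CommitLimit:"  { limit = $2 }   # upper bound on committed memory
    $1 == "Committed_AS:" { used  = $2 }   # memory already promised to processes
    END { printf "%.0f\n", limit - used }  # what can still be promised, in kB
  ' "${1:-/proc/meminfo}"
}
```

Run without arguments (commit_headroom) it reports the live value. With the two numbers quoted above it prints 1496496.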
If you have disabled memory overcommit (to avoid the OOM killer) with:
echo 2 > /proc/sys/vm/overcommit_memory
then the commit limit is computed as follows:
2 - Don't overcommit. The total address space commit for the system is not permitted to exceed swap + a configurable amount (default is 50%) of physical RAM. Depending on the amount you use, in most situations this means a process will not be killed while accessing pages but will receive errors on memory allocation as appropriate.
(From: https://www.kernel.org/doc/Documentation/vm/overcommit-accounting)
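As a rough check of that rule against the numbers in this post, here is a small sketch. It assumes 4 kB pages, assumes vm.overcommit_kbytes is not set, and subtracts huge pages from RAM the way the kernel's /proc documentation describes CommitLimit. The function name expected_commit_limit is invented for this example.

```shell
# expected_commit_limit MEM_KB SWAP_KB RATIO [HUGETLB_KB]
# Estimate CommitLimit (in kB) for vm.overcommit_memory=2.
# Assumption: 4 kB pages; the arithmetic is done in whole pages.
expected_commit_limit() {
  page_kb=4
  echo $(( ( ($1 - ${4:-0}) / page_kb * $3 / 100 + $2 / page_kb ) * page_kb ))
}

# MemTotal and SwapTotal from the /proc/meminfo output in the question:
expected_commit_limit 528349276 1286882284 50    # default overcommit_ratio
expected_commit_limit 528349276 1286882284 100   # overcommit_ratio = 100
```

The first call prints 1551056920, which is the CommitLimit shown in the question. The second prints 1815231560, which is RAM plus swap.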
You can make the full amount of memory available with:
echo 100 > /proc/sys/vm/overcommit_ratio
Then you will get out of memory only when physical RAM and swap are both fully reserved.
In this case the name overcommit_ratio is a bit misleading: you are not overcommitting anything.
Even with this setting you may see out of memory before swap is exhausted.
malloc.c:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv) {
  long bytes, sleep_sec;
  if (argc != 3) {
    printf("Usage: malloc bytes sleep_sec\n");
    exit(1);
  }
  sscanf(argv[1], "%ld", &bytes);
  sscanf(argv[2], "%ld", &sleep_sec);
  printf("Bytes: %ld Sleep: %ld\n", bytes, sleep_sec);
  /* Reserve the memory but never touch it: it is committed, not used. */
  if (malloc(bytes)) {
    sleep(sleep_sec);
  } else {
    printf("Out of memory\n");
    exit(1);
  }
  return 0;
}
Compile with:
gcc -o malloc malloc.c
Run it like this (reserves 1 GB for 10 seconds):
./malloc 1073741824 10
If you run it, you may see OOM even though there is free swap:
# Plenty of ram+swap free before we start
$ free -m
              total        used        free      shared  buff/cache   available
Mem:         515966        2824      512361          16         780      511234
Swap:       1256720           0     1256720

# Reserve 1.8 TB
$ ./malloc 1800000000000 100 &
Bytes: 1800000000000 Sleep: 100

# It looks as if there is plenty of ram+swap free
$ free -m
              total        used        free      shared  buff/cache   available
Mem:         515966        2824      512361          16         780      511234
Swap:       1256720           0     1256720

# But there isn't: It is all reserved (just not used yet)
$ cat /proc/meminfo |grep omm
CommitLimit:    1815231560 kB
Committed_AS:   1761680484 kB

# Thus this fails (as you would expect)
$ ./malloc 180000000000 100
Bytes: 180000000000 Sleep: 100
Out of memory
So while free will often do the right thing in practice, looking at CommitLimit and Committed_AS seems to be more bulletproof.
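One way to turn that advice into a quick pre-flight check is sketched below. can_reserve is a hypothetical helper name, and the result is only an estimate: it is meaningful only with vm.overcommit_memory=2, and the kernel may hold back a small reserve on top of what /proc/meminfo shows.

```shell
# can_reserve BYTES [FILE]
# Succeed (exit 0) if BYTES fits into CommitLimit - Committed_AS.
# FILE defaults to /proc/meminfo; it is a parameter only to allow dry runs.
can_reserve() {
  awk -v want="$1" '
    $1 == "CommitLimit:"  { limit = $2 }
    $1 == "Committed_AS:" { used  = $2 }
    # /proc/meminfo reports kB, the request is in bytes
    END { exit !((limit - used) * 1024 >= want + 0) }
  ' "${2:-/proc/meminfo}"
}
```

For example, can_reserve 180000000000 && ./malloc 180000000000 100 would skip the allocation in the situation shown above, where only about 53 GB of commit space was left.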