Linux-Kernel
Why does the OOM killer kill processes when swap is hardly used?
I have an ARM-based server with just under 2GB of addressable memory and 4GB of swap enabled:
root@bang:~> free -m
              total        used        free      shared  buff/cache   available
Mem:           1976         388          48          15        1539        1487
Swap:          4095           1        4094
Once the system has been running for a day or so, the OOM killer starts getting a bit aggressive and begins killing things:
Aug 3 12:59:01 bang kernel: [51585.822794] dump1090 invoked oom-killer: gfp_mask=0x24040c0(GFP_KERNEL|__GFP_COMP), order=2, oom_score_adj=0
Aug 3 12:59:01 bang kernel: [51585.822851] dump1090 cpuset=/ mems_allowed=0
Aug 3 12:59:01 bang kernel: [51585.822963] CPU: 6 PID: 25989 Comm: dump1090 Tainted: G C 4.7.0-41238-g206dbde-dirty #16
Aug 3 12:59:01 bang kernel: [51585.823010] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
Aug 3 12:59:01 bang kernel: [51585.823120] [<c010e4ec>] (unwind_backtrace) from [<c010b234>] (show_stack+0x10/0x14)
Aug 3 12:59:01 bang kernel: [51585.823203] [<c010b234>] (show_stack) from [<c04eff84>] (dump_stack+0x88/0x9c)
Aug 3 12:59:01 bang kernel: [51585.823283] [<c04eff84>] (dump_stack) from [<c0227830>] (dump_header+0x5c/0x1b0)
Aug 3 12:59:01 bang kernel: [51585.823357] [<c0227830>] (dump_header) from [<c01d1aec>] (oom_kill_process+0x328/0x494)
Aug 3 12:59:01 bang kernel: [51585.823420] [<c01d1aec>] (oom_kill_process) from [<c01d1fa0>] (out_of_memory+0x2e0/0x338)
Aug 3 12:59:01 bang kernel: [51585.823487] [<c01d1fa0>] (out_of_memory) from [<c01d6724>] (__alloc_pages_nodemask+0xd80/0xda0)
Aug 3 12:59:01 bang kernel: [51585.823555] [<c01d6724>] (__alloc_pages_nodemask) from [<c01d6a28>] (alloc_kmem_pages+0x18/0xb0)
Aug 3 12:59:01 bang kernel: [51585.823620] [<c01d6a28>] (alloc_kmem_pages) from [<c01ee7a4>] (kmalloc_order+0x10/0x20)
Aug 3 12:59:01 bang kernel: [51585.823688] [<c01ee7a4>] (kmalloc_order) from [<c06435b4>] (proc_submiturb+0x60c/0xe88)
Aug 3 12:59:01 bang kernel: [51585.823749] [<c06435b4>] (proc_submiturb) from [<c06446e4>] (usbdev_do_ioctl+0x8b4/0x1bfc)
Aug 3 12:59:01 bang kernel: [51585.823816] [<c06446e4>] (usbdev_do_ioctl) from [<c023c74c>] (do_vfs_ioctl+0x98/0x8e4)
Aug 3 12:59:01 bang kernel: [51585.823879] [<c023c74c>] (do_vfs_ioctl) from [<c023d004>] (SyS_ioctl+0x6c/0x7c)
Aug 3 12:59:01 bang kernel: [51585.823948] [<c023d004>] (SyS_ioctl) from [<c0107740>] (ret_fast_syscall+0x0/0x3c)
Aug 3 12:59:01 bang kernel: [51585.823987] Mem-Info:
Aug 3 12:59:01 bang kernel: [51585.824073] active_anon:43846 inactive_anon:46454 isolated_anon:0
Aug 3 12:59:01 bang kernel: [51585.824073] active_file:132799 inactive_file:109909 isolated_file:19
Aug 3 12:59:01 bang kernel: [51585.824073] unevictable:1408 dirty:56 writeback:0 unstable:0
Aug 3 12:59:01 bang kernel: [51585.824073] slab_reclaimable:17104 slab_unreclaimable:6387
Aug 3 12:59:01 bang kernel: [51585.824073] mapped:13368 shmem:3582 pagetables:971 bounce:0
Aug 3 12:59:01 bang kernel: [51585.824073] free:92967 free_pcp:31 free_cma:32601
Aug 3 12:59:01 bang kernel: [51585.824216] Normal free:13240kB min:3420kB low:4272kB high:5124kB active_anon:26652kB inactive_anon:26692kB active_file:360240kB inactive_file:194904kB unevictable:1336kB isolated(anon):0kB isolated(file):76kB present:770048kB managed:736192kB mlocked:1336kB dirty:16kB writeback:0kB mapped:11600kB shmem:900kB slab_reclaimable:68416kB slab_unreclaimable:25548kB kernel_stack:3384kB pagetables:3884kB unstable:0kB bounce:0kB free_pcp:124kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Aug 3 12:59:01 bang kernel: [51585.824259] lowmem_reserve[]: 0 9040 9040
Aug 3 12:59:01 bang kernel: [51585.824442] HighMem free:358664kB min:512kB low:1864kB high:3216kB active_anon:148732kB inactive_anon:159124kB active_file:170956kB inactive_file:244732kB unevictable:4296kB isolated(anon):0kB isolated(file):0kB present:1288192kB managed:1288192kB mlocked:4296kB dirty:208kB writeback:0kB mapped:41872kB shmem:13428kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:130404kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Aug 3 12:59:01 bang kernel: [51585.824483] lowmem_reserve[]: 0 0 0
Aug 3 12:59:01 bang kernel: [51585.824592] Normal: 1300*4kB (UMEH) 525*8kB (UMEH) 11*16kB (H) 9*32kB (H) 8*64kB (H) 5*128kB (H) 3*256kB (H) 1*512kB (H) 1*1024kB (H) 0*2048kB 0*4096kB = 13320kB
Aug 3 12:59:01 bang kernel: [51585.825061] HighMem: 1212*4kB (UMC) 538*8kB (UM) 160*16kB (UM) 140*32kB (UMC) 108*64kB (UMC) 34*128kB (UM) 19*256kB (UMC) 10*512kB (UM) 8*1024kB (UMC) 7*2048kB (UMC) 73*4096kB (UMC) = 358976kB
Aug 3 12:59:01 bang kernel: [51585.825558] 247387 total pagecache pages
Aug 3 12:59:01 bang kernel: [51585.825596] 18 pages in swap cache
Aug 3 12:59:01 bang kernel: [51585.825636] Swap cache stats: add 1360, delete 1342, find 33/71
Aug 3 12:59:01 bang kernel: [51585.825672] Free swap = 4190368kB
Aug 3 12:59:01 bang kernel: [51585.825705] Total swap = 4194300kB
Aug 3 12:59:01 bang kernel: [51585.825739] 514560 pages RAM
Aug 3 12:59:01 bang kernel: [51585.825772] 322048 pages HighMem/MovableOnly
Aug 3 12:59:01 bang kernel: [51585.825804] 8464 pages reserved
Aug 3 12:59:01 bang kernel: [51585.825836] 32768 pages cma reserved
Aug 3 12:59:01 bang kernel: [51585.825869] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
Aug 3 12:59:01 bang kernel: [51585.825958] [ 2363] 0 2363 2724 664 8 0 13 -1000 systemd-udevd
Aug 3 12:59:01 bang kernel: [51585.826019] [ 4035] 0 4035 1736 445 7 0 16 0 syslog-ng
Aug 3 12:59:01 bang kernel: [51585.826073] [ 4036] 0 4036 11306 1067 15 0 38 0 syslog-ng
Aug 3 12:59:01 bang kernel: [51585.826123] [ 4037] 0 4037 1149 639 7 0 0 0 log_to_sql.sh
Aug 3 12:59:01 bang kernel: [51585.826173] [ 4235] 60 4235 57365 13082 62 0 881 0 mysqld
Aug 3 12:59:01 bang kernel: [51585.826222] [ 4283] 107 4283 2557 1006 9 0 0 0 ulogd
Aug 3 12:59:01 bang kernel: [51585.826268] [ 4698] 0 4698 899 404 5 0 0 0 pppd
Aug 3 12:59:01 bang kernel: [51585.826316] [ 4762] 105 4762 1183 472 6 0 0 0 dnsmasq
Aug 3 12:59:01 bang kernel: [51585.826363] [ 4970] 0 4970 1292 542 7 0 0 -1000 sshd
Aug 3 12:59:01 bang kernel: [51585.826410] [ 5079] 0 5079 32467 4668 25 0 0 0 apache2
Aug 3 12:59:01 bang kernel: [51585.826457] [ 5081] 81 5081 168576 28259 140 0 0 0 apache2
Aug 3 12:59:01 bang kernel: [51585.826504] [ 5082] 81 5082 173465 34888 154 0 0 0 apache2
Aug 3 12:59:01 bang kernel: [51585.826550] [ 5211] 0 5211 594 29 5 0 0 0 atd
Aug 3 12:59:01 bang kernel: [51585.826597] [ 5239] 102 5239 777 430 5 0 0 0 dbus-daemon
Aug 3 12:59:01 bang kernel: [51585.826644] [ 5299] 103 5299 2665 2156 11 0 0 0 dhcpd
Aug 3 12:59:01 bang kernel: [51585.826691] [ 5365] 240 5365 601 209 5 0 0 0 distccd
Aug 3 12:59:01 bang kernel: [51585.826738] [ 5366] 240 5366 601 25 4 0 0 0 distccd
Aug 3 12:59:01 bang kernel: [51585.826784] [ 5399] 123 5399 1874 1411 10 0 0 0 ntpd
Aug 3 12:59:01 bang kernel: [51585.826830] [ 5428] 240 5428 601 25 4 0 0 0 distccd
Aug 3 12:59:01 bang kernel: [51585.826876] [ 5433] 0 5433 929 617 7 0 0 0 dovecot
Aug 3 12:59:01 bang kernel: [51585.826922] [ 5443] 97 5443 700 512 6 0 0 0 anvil
Aug 3 12:59:01 bang kernel: [51585.826968] [ 5444] 0 5444 733 561 5 0 0 0 log
Aug 3 12:59:01 bang kernel: [51585.827015] [ 5470] 8 5470 10720 1045 14 0 0 0 exim
Aug 3 12:59:01 bang kernel: [51585.827061] [ 5477] 240 5477 601 25 4 0 0 0 distccd
Aug 3 12:59:01 bang kernel: [51585.827107] [ 5497] 240 5497 601 25 4 0 0 0 distccd
Aug 3 12:59:01 bang kernel: [51585.827153] [ 5500] 0 5500 20882 2674 21 0 0 0 fail2ban-server
Aug 3 12:59:01 bang kernel: [51585.827199] [ 5502] 0 5502 1677 1007 7 0 0 0 screen
Aug 3 12:59:01 bang kernel: [51585.827246] [ 5503] 240 5503 601 25 4 0 0 0 distccd
Aug 3 12:59:01 bang kernel: [51585.827291] [ 5504] 0 5504 1295 804 8 0 0 0 bash
Aug 3 12:59:01 bang kernel: [51585.827339] [ 5505] 0 5505 1347 704 6 0 0 0 top
Aug 3 12:59:01 bang kernel: [51585.827385] [ 5506] 0 5506 842 102 5 0 0 0 tail
Aug 3 12:59:01 bang kernel: [51585.827431] [ 5507] 0 5507 842 100 6 0 0 0 tail
Aug 3 12:59:01 bang kernel: [51585.827477] [ 5510] 0 5510 1150 584 7 0 0 0 multitail.sh
Aug 3 12:59:01 bang kernel: [51585.827524] [ 5519] 0 5519 2466 1794 9 0 0 0 multitail
Aug 3 12:59:01 bang kernel: [51585.827572] [ 5526] 0 5526 941 651 6 0 0 0 gam_server
Aug 3 12:59:01 bang kernel: [51585.827618] [ 5527] 0 5527 842 108 6 0 0 0 tail
Aug 3 12:59:01 bang kernel: [51585.827664] [ 5528] 0 5528 842 105 5 0 0 0 tail
Aug 3 12:59:01 bang kernel: [51585.827710] [ 5529] 0 5529 842 100 5 0 0 0 tail
Aug 3 12:59:01 bang kernel: [51585.827756] [ 5530] 0 5530 842 355 6 0 0 0 tail
Aug 3 12:59:01 bang kernel: [51585.827802] [ 5531] 0 5531 843 386 6 0 0 0 tail
Aug 3 12:59:01 bang kernel: [51585.827848] [ 5532] 240 5532 601 25 4 0 0 0 distccd
Aug 3 12:59:01 bang kernel: [51585.827894] [ 5550] 240 5550 601 25 4 0 0 0 distccd
Aug 3 12:59:01 bang kernel: [51585.827940] [ 5622] 0 5622 615 442 5 0 0 0 rpcbind
Aug 3 12:59:01 bang kernel: [51585.827986] [ 5634] 240 5634 601 25 4 0 0 0 distccd
Aug 3 12:59:01 bang kernel: [51585.828032] [ 5652] 0 5652 787 572 5 0 0 0 rpc.statd
Aug 3 12:59:01 bang kernel: [51585.828078] [ 5707] 0 5707 789 46 5 0 0 0 rpc.idmapd
Aug 3 12:59:01 bang kernel: [51585.828124] [ 5733] 240 5733 601 25 4 0 0 0 distccd
Aug 3 12:59:01 bang kernel: [51585.828170] [ 5747] 0 5747 856 497 5 0 0 0 rpc.mountd
Aug 3 12:59:01 bang kernel: [51585.828220] [ 5804] 101 5804 562 367 6 0 0 0 radvd
Aug 3 12:59:01 bang kernel: [51585.828266] [ 5805] 0 5805 562 239 6 0 0 0 radvd
Aug 3 12:59:01 bang kernel: [51585.828313] [ 5839] 240 5839 601 25 4 0 0 0 distccd
Aug 3 12:59:01 bang kernel: [51585.828359] [ 5860] 0 5860 1150 618 5 0 0 0 heating.sh
Aug 3 12:59:01 bang kernel: [51585.828405] [ 5898] 0 5898 1007 451 6 0 0 0 agetty
Aug 3 12:59:01 bang kernel: [51585.828451] [ 5899] 0 5899 1007 436 7 0 0 0 agetty
Aug 3 12:59:01 bang kernel: [51585.828497] [ 5900] 0 5900 1007 419 6 0 0 0 agetty
Aug 3 12:59:01 bang kernel: [51585.828543] [ 5901] 0 5901 1007 435 5 0 0 0 agetty
Aug 3 12:59:01 bang kernel: [51585.828589] [ 5902] 0 5902 1007 436 6 0 0 0 agetty
Aug 3 12:59:01 bang kernel: [51585.828779] [ 5903] 0 5903 1007 449 7 0 0 0 agetty
Aug 3 12:59:01 bang kernel: [51585.828827] [ 5904] 0 5904 609 420 5 0 0 0 agetty
Aug 3 12:59:01 bang kernel: [51585.828875] [ 6004] 0 6004 1455 921 7 0 0 0 bluetoothd
Aug 3 12:59:01 bang kernel: [51585.828926] [ 6010] 0 6010 39540 7714 43 0 0 0 python2
Aug 3 12:59:01 bang kernel: [51585.828974] [ 3224] 0 3224 2247 1027 10 0 0 0 sshd
Aug 3 12:59:01 bang kernel: [51585.829021] [ 3227] 1000 3227 2247 945 9 0 0 0 sshd
Aug 3 12:59:01 bang kernel: [51585.829066] [ 3228] 1000 3228 1298 774 8 0 0 0 bash
Aug 3 12:59:01 bang kernel: [51585.829111] [ 3236] 1000 3236 1347 645 6 0 0 0 su
Aug 3 12:59:01 bang kernel: [51585.829155] [ 3238] 0 3238 1298 799 7 0 0 0 bash
Aug 3 12:59:01 bang kernel: [51585.829202] [ 880] 0 880 1082 759 7 0 0 0 config
Aug 3 12:59:01 bang kernel: [51585.829247] [ 1099] 106 1099 1327 1093 7 0 0 0 imap-login
Aug 3 12:59:01 bang kernel: [51585.829334] [ 1111] 8 1111 1046 872 6 0 0 0 imap
Aug 3 12:59:01 bang kernel: [51585.829449] [10717] 0 10717 1299 765 7 0 0 0 bash
Aug 3 12:59:01 bang kernel: [51585.829564] [10784] 0 10784 2885 1232 9 0 0 0 mysql
Aug 3 12:59:01 bang kernel: [51585.829701] [16321] 40 16321 32298 9969 39 0 0 0 named
Aug 3 12:59:01 bang kernel: [51585.829900] [24379] 0 24379 996 411 6 0 0 0 cron
Aug 3 12:59:01 bang kernel: [51585.830042] [25814] 0 25814 2270 1056 10 0 0 0 sshd
Aug 3 12:59:01 bang kernel: [51585.830162] [25818] 1000 25818 2304 943 8 0 0 0 sshd
Aug 3 12:59:01 bang kernel: [51585.830290] [25819] 1000 25819 1298 769 6 0 0 0 bash
Aug 3 12:59:01 bang kernel: [51585.830405] [25827] 1000 25827 1347 642 7 0 0 0 su
Aug 3 12:59:01 bang kernel: [51585.830505] [25828] 0 25828 1298 760 8 0 0 0 bash
Aug 3 12:59:01 bang kernel: [51585.830620] [25834] 0 25834 1242 565 7 0 0 0 screen
Aug 3 12:59:01 bang kernel: [51585.830753] [12903] 0 12903 1299 788 7 0 0 0 bash
Aug 3 12:59:01 bang kernel: [51585.830872] [25975] 0 25975 6895 579 11 0 0 0 dump1090
Aug 3 12:59:01 bang kernel: [51585.831006] Out of memory: Kill process 5082 (apache2) score 22 or sacrifice child
Aug 3 12:59:01 bang kernel: [51585.832683] Killed process 5082 (apache2) total-vm:693860kB, anon-rss:118856kB, file-rss:13300kB, shmem-rss:7396kB
The problem is that hardly any swap is in use. Why isn't anything being swapped out instead of the OOM killer being invoked?
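One detail in the log may be worth checking before blaming swap: the failing request is `order=2` with `GFP_KERNEL`, i.e. four physically contiguous pages from low memory, so what matters is the per-order free list of the Normal zone rather than the total amount of free memory or swap. The sketch below parses the `Normal:` free-list line copied from the log and lists the blocks large enough for an order-2 request. The helper names are hypothetical, a 4kB page size is assumed, and reading the `H` tag as the high-atomic reserve is my assumption about this kernel's dump format.

```python
import re

# Free-list line for the Normal zone, copied from the OOM report.
NORMAL_LINE = (
    "Normal: 1300*4kB (UMEH) 525*8kB (UMEH) 11*16kB (H) 9*32kB (H) "
    "8*64kB (H) 5*128kB (H) 3*256kB (H) 1*512kB (H) 1*1024kB (H) "
    "0*2048kB 0*4096kB = 13320kB"
)

PAGE_KB = 4  # assumption: 4kB pages, as the smallest block size suggests


def parse_buddy_line(line):
    """Split a free-list line into (zone, [(block_kb, count, tags), ...])."""
    zone, rest = line.split(":", 1)
    entries = [
        (int(size_kb), int(count), tags)
        for count, size_kb, tags in re.findall(
            r"(\d+)\*(\d+)kB(?: \(([A-Z]+)\))?", rest
        )
    ]
    return zone.strip(), entries


def blocks_at_least(entries, order):
    """Return the non-empty entries whose block size covers 2**order pages."""
    min_kb = PAGE_KB << order
    return [(kb, n, tags) for kb, n, tags in entries if kb >= min_kb and n > 0]


zone, entries = parse_buddy_line(NORMAL_LINE)
for block_kb, count, tags in blocks_at_least(entries, order=2):
    print(f"{zone}: {count} free block(s) of {block_kb}kB, tags={tags}")
```

Every block of 16kB or more in the Normal zone carries only the `H` tag. If that tag does mean a reserve that ordinary `GFP_KERNEL` requests cannot use, the allocation had nothing suitable to take, and swapping pages out would not by itself produce contiguous low-memory blocks.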
Here are the VM (virtual memory) settings:
root@bang:~> grep '' /proc/sys/vm/*
/proc/sys/vm/admin_reserve_kbytes:8192
/proc/sys/vm/block_dump:0
grep: /proc/sys/vm/compact_memory: Permission denied
/proc/sys/vm/compact_unevictable_allowed:1
/proc/sys/vm/dirty_background_bytes:0
/proc/sys/vm/dirty_background_ratio:10
/proc/sys/vm/dirty_bytes:0
/proc/sys/vm/dirty_expire_centisecs:3000
/proc/sys/vm/dirty_ratio:20
/proc/sys/vm/dirtytime_expire_seconds:43200
/proc/sys/vm/dirty_writeback_centisecs:500
/proc/sys/vm/drop_caches:0
/proc/sys/vm/extfrag_threshold:500
/proc/sys/vm/highmem_is_dirtyable:0
/proc/sys/vm/laptop_mode:0
/proc/sys/vm/legacy_va_layout:0
/proc/sys/vm/lowmem_reserve_ratio:32 32
/proc/sys/vm/max_map_count:65530
/proc/sys/vm/min_free_kbytes:3420
/proc/sys/vm/mmap_min_addr:4096
/proc/sys/vm/mmap_rnd_bits:8
/proc/sys/vm/nr_pdflush_threads:0
/proc/sys/vm/oom_dump_tasks:1
/proc/sys/vm/oom_kill_allocating_task:0
/proc/sys/vm/overcommit_kbytes:0
/proc/sys/vm/overcommit_memory:0
/proc/sys/vm/overcommit_ratio:50
/proc/sys/vm/page-cluster:3
/proc/sys/vm/panic_on_oom:0
/proc/sys/vm/percpu_pagelist_fraction:0
/proc/sys/vm/stat_interval:1
/proc/sys/vm/swappiness:50
/proc/sys/vm/user_reserve_kbytes:62869
/proc/sys/vm/vfs_cache_pressure:100
/proc/sys/vm/watermark_scale_factor:10
The kernel is mainline 4.7 with a few Exynos patches:
Linux bang 4.7.0-41238-g206dbde-dirty #16 SMP PREEMPT Tue Aug 2 22:35:38 BST 2016 armv7l SAMSUNG EXYNOS (Flattened Device Tree) GNU/Linux
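Now, since I built the kernel myself, it's entirely possible that I chose a wrong option somewhere. Any help would be appreciated.
EDIT1: This seems to happen under heavy I/O, but I haven't yet determined whether it is due to memory filling up or something else.
EDIT2: There appears to be an ongoing (at the time of writing) discussion on the kernel mailing list about what looks like the same problem. I'll monitor it and report back.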
This is caused by a kernel bug present in Linux kernels 4.7.0 through 4.7.4 (fixed by this commit in 4.7.5 and this commit in 4.8.0).
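To check whether a given machine falls in that range, a release string such as the one printed by `uname -r` can be compared against it. This is only a sketch: the function names are hypothetical, and the bounds simply encode the 4.7.0 to 4.7.4 range stated above, so distribution kernels with backported fixes (or regressions) will not be classified correctly.

```python
import re

# Range taken from the statement above: affected from 4.7.0 through 4.7.4.
FIRST_AFFECTED = (4, 7, 0)
LAST_AFFECTED = (4, 7, 4)


def parse_release(release):
    """Extract (major, minor, patch) from a string like '4.7.0-41238-g206dbde-dirty'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing patch level (e.g. '4.7') is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def in_affected_range(release):
    """True if the release number lies within the range named in the answer."""
    return FIRST_AFFECTED <= parse_release(release) <= LAST_AFFECTED


# Usage on the running kernel:
#   import platform
#   print(in_affected_range(platform.release()))
```

For the kernel in the question, `in_affected_range("4.7.0-41238-g206dbde-dirty")` evaluates to `True`.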
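I hit this bug on FC25 with kernel 4.8.8, under a workload similar to the one above (apache and dovecot). According to https://marius.bloggt-in-braunschweig.de/category/fedora/ the cause is fragmentation of file (page cache) memory, and there is a workaround: clear the cache periodically via cron until the fix lands in 4.9: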
sync && echo 1 > /proc/sys/vm/drop_caches
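If the cron job is already a Python script, the same two steps can be done without a shell. A minimal sketch, assuming it runs as root on Linux; the function name `drop_page_cache` and its parameters are hypothetical and exist only so the target path and the sync step can be swapped out for testing.

```python
import os

DROP_CACHES = "/proc/sys/vm/drop_caches"


def drop_page_cache(path=DROP_CACHES, sync=os.sync):
    """Flush dirty data to disk, then ask the kernel to drop the clean page cache."""
    # Equivalent of `sync`: dirty pages cannot be dropped, so write them out first.
    sync()
    # Equivalent of `echo 1 > /proc/sys/vm/drop_caches`: 1 selects the page cache only.
    with open(path, "w") as handle:
        handle.write("1\n")


# Usage from a root cron job:
#   drop_page_cache()
```

Dropping the cache only discards clean, reclaimable data, so the cost is extra disk reads afterwards rather than data loss.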