impala-issues mailing list archives

From "Matthew Jacobs (JIRA)" <j...@apache.org>
Subject [jira] [Created] (IMPALA-5297) free-pool-test may be OOM killed on jenkins.impala.io runs
Date Tue, 09 May 2017 20:51:04 GMT
Matthew Jacobs created IMPALA-5297:
--------------------------------------

             Summary: free-pool-test may be OOM killed on jenkins.impala.io runs
                 Key: IMPALA-5297
                 URL: https://issues.apache.org/jira/browse/IMPALA-5297
             Project: IMPALA
          Issue Type: Bug
          Components: Infrastructure
    Affects Versions: Impala 2.9.0
            Reporter: Matthew Jacobs
            Priority: Critical


On gerrit-verify-dryrun jobs, attempting to submit [a change to update the Kudu version|https://gerrit.cloudera.org/#/c/6797/]
seems to cause free-pool-test to run out of memory.

The free-pool-test makes some large allocations (I think around 7 GB in total), and when
other processes are running, the gerrit jobs may be getting close to the 15 GB CommitLimit
on these AWS hosts.

Here's the output from kern.log:
{code}
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153878] java invoked oom-killer: gfp_mask=0x201da,
order=0, oom_score_adj=0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153882] java cpuset=/ mems_allowed=0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153884] CPU: 1 PID: 19555 Comm: java Not tainted
3.13.0-100-generic #147-Ubuntu
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153886] Hardware name: Xen HVM domU, BIOS 4.2.amazon
02/16/2017
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153887]  0000000000000000 ffff88066708f970 ffffffff8172a4bb
ffff88047b3b1800
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153891]  0000000000000000 ffff88066708f9f8 ffffffff81724a5a
0000000000000000
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153894]  0000000000000000 0000000000000000 0000000000000000
0000000000000000
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153897] Call Trace:
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153904]  [<ffffffff8172a4bb>] dump_stack+0x64/0x82
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153908]  [<ffffffff81724a5a>] dump_header+0x7f/0x1f1
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153912]  [<ffffffff81155d11>] oom_kill_process+0x201/0x360
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153917]  [<ffffffff812dcab5>] ? security_capable_noaudit+0x15/0x20
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153919]  [<ffffffff811564a1>] out_of_memory+0x471/0x4b0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153922]  [<ffffffff8115c7bc>] __alloc_pages_nodemask+0xa6c/0xb90
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153926]  [<ffffffff8119ae83>] alloc_pages_current+0xa3/0x160
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153930]  [<ffffffff811527c7>] __page_cache_alloc+0x97/0xc0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153932]  [<ffffffff81154235>] filemap_fault+0x185/0x410
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153936]  [<ffffffff8117944f>] __do_fault+0x6f/0x530
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153941]  [<ffffffff810135db>] ? __switch_to+0x16b/0x4f0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153943]  [<ffffffff8117d2a2>] handle_mm_fault+0x482/0xf00
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153947]  [<ffffffff81090df7>] ? hrtimer_try_to_cancel+0x47/0x100
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153950]  [<ffffffff8172df0e>] ? schedule_hrtimeout_range_clock+0xce/0x170
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153954]  [<ffffffff81736644>] __do_page_fault+0x184/0x560
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153957]  [<ffffffff8120a45f>] ? ep_poll+0x2ff/0x330
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153961]  [<ffffffff8109d2f0>] ? wake_up_state+0x20/0x20
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153964]  [<ffffffff81736a3a>] do_page_fault+0x1a/0x70
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153966]  [<ffffffff8120b5cc>] ? SyS_epoll_wait+0xac/0x100
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153968]  [<ffffffff81732d68>] page_fault+0x28/0x30
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153970] Mem-Info:
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153971] Node 0 DMA per-cpu:
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153973] CPU    0: hi:    0, btch:   1 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153974] CPU    1: hi:    0, btch:   1 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153975] CPU    2: hi:    0, btch:   1 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153976] CPU    3: hi:    0, btch:   1 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153977] CPU    4: hi:    0, btch:   1 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153978] CPU    5: hi:    0, btch:   1 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153979] CPU    6: hi:    0, btch:   1 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153980] CPU    7: hi:    0, btch:   1 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153981] CPU    8: hi:    0, btch:   1 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153982] CPU    9: hi:    0, btch:   1 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153984] CPU   10: hi:    0, btch:   1 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153985] CPU   11: hi:    0, btch:   1 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153986] CPU   12: hi:    0, btch:   1 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153987] CPU   13: hi:    0, btch:   1 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153988] CPU   14: hi:    0, btch:   1 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153989] CPU   15: hi:    0, btch:   1 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153990] Node 0 DMA32 per-cpu:
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153991] CPU    0: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153992] CPU    1: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153993] CPU    2: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153995] CPU    3: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153996] CPU    4: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153997] CPU    5: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153998] CPU    6: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153999] CPU    7: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154001] CPU    8: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154002] CPU    9: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154003] CPU   10: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154004] CPU   11: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154005] CPU   12: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154006] CPU   13: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154007] CPU   14: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154008] CPU   15: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154009] Node 0 Normal per-cpu:
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154010] CPU    0: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154011] CPU    1: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154012] CPU    2: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154013] CPU    3: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154014] CPU    4: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154015] CPU    5: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154016] CPU    6: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154017] CPU    7: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154018] CPU    8: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154019] CPU    9: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154020] CPU   10: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154021] CPU   11: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154023] CPU   12: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154024] CPU   13: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154025] CPU   14: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154026] CPU   15: hi:  186, btch:  31 usd:  
0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154028] active_anon:7546116 inactive_anon:3718
isolated_anon:0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154028]  active_file:405 inactive_file:19 isolated_file:0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154028]  unevictable:5 dirty:193 writeback:0
unstable:0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154028]  free:47219 slab_reclaimable:15089 slab_unreclaimable:22997
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154028]  mapped:6952 shmem:7403 pagetables:22940
bounce:0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154028]  free_cma:0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154031] Node 0 DMA free:15904kB min:32kB low:40kB
high:48kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB
isolated(anon):0kB isolated(file):0kB present:15988kB managed:15904kB mlocked:0kB dirty:0kB
writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB
pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable?
yes
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154035] lowmem_reserve[]: 0 3744 30129 30129
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154038] Node 0 DMA32 free:113892kB min:8392kB
low:10488kB high:12588kB active_anon:3689064kB inactive_anon:1468kB active_file:260kB inactive_file:20kB
unevictable:4kB isolated(anon):0kB isolated(file):0kB present:3915776kB managed:3836720kB
mlocked:4kB dirty:104kB writeback:0kB mapped:4468kB shmem:4496kB slab_reclaimable:6640kB slab_unreclaimable:9296kB
kernel_stack:3832kB pagetables:10816kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB
pages_scanned:451 all_unreclaimable? yes
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154041] lowmem_reserve[]: 0 0 26385 26385
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154044] Node 0 Normal free:59080kB min:59152kB
low:73940kB high:88728kB active_anon:26495400kB inactive_anon:13404kB active_file:1360kB inactive_file:56kB
unevictable:16kB isolated(anon):0kB isolated(file):0kB present:27525120kB managed:27019008kB
mlocked:16kB dirty:668kB writeback:0kB mapped:23340kB shmem:25116kB slab_reclaimable:53716kB
slab_unreclaimable:82692kB kernel_stack:33392kB pagetables:80944kB unstable:0kB bounce:0kB
free_cma:0kB writeback_tmp:0kB pages_scanned:2483 all_unreclaimable? yes
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154047] lowmem_reserve[]: 0 0 0 0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154049] Node 0 DMA: 0*4kB 0*8kB 0*16kB 1*32kB
(U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15904kB
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154058] Node 0 DMA32: 167*4kB (UEM) 1991*8kB
(UEM) 590*16kB (UEM) 275*32kB (UEM) 204*64kB (UEM) 139*128kB (UEM) 66*256kB (UEM) 32*512kB
(EM) 11*1024kB (UEM) 2*2048kB (EM) 0*4096kB = 114324kB
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154068] Node 0 Normal: 15098*4kB (UEM) 36*8kB
(EM) 3*16kB (EM) 1*32kB (E) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 60760kB
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154076] Node 0 hugepages_total=0 hugepages_free=0
hugepages_surp=0 hugepages_size=2048kB
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154077] 7651 total pagecache pages
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154078] 0 pages in swap cache
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154079] Swap cache stats: add 0, delete 0, find
0/0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154080] Free swap  = 0kB
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154081] Total swap = 0kB
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154082] 7864221 pages RAM
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154083] 0 pages HighMem/MovableOnly
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154084] 126528 pages reserved
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344897] [ pid ]   uid  tgid total_vm      rss
nr_ptes swapents oom_score_adj name
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344918] [  740]     0   740     4868       49
     13        0             0 upstart-udev-br
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344920] [  747]     0   747    12521      234
     27        0         -1000 systemd-udevd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344922] [  912]     0   912     3814       51
     12        0             0 upstart-socket-
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344924] [  960]     0   960     2554      574
      8        0             0 dhclient
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344926] [ 1256]     0  1256     3818       55
     13        0             0 upstart-file-br
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344928] [ 1402]     0  1402     3633       41
     12        0             0 getty
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344930] [ 1405]     0  1405     3633       40
     12        0             0 getty
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344932] [ 1407]   101  1407    65017      688
     29        0             0 rsyslogd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344934] [ 1410]     0  1410     3633       42
     12        0             0 getty
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344935] [ 1411]     0  1411     3633       40
     12        0             0 getty
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344937] [ 1413]     0  1413     3633       39
     10        0             0 getty
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344939] [ 1437]     0  1437    15344      172
     34        0         -1000 sshd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344941] [ 1465]     0  1465     4783       40
     13        0             0 atd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344943] [ 1466]     0  1466     5912       53
     17        0             0 cron
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344944] [ 1481]     0  1481     1091       36
      7        0             0 acpid
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344946] [ 1490]   102  1490     9802      100
     24        0             0 dbus-daemon
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344948] [ 1501]     0  1501    10861       89
     25        0             0 systemd-logind
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344950] [ 1507]     0  1507     4863      112
     14        0             0 irqbalance
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344952] [ 1608]     0  1608    26411      252
     54        0             0 sshd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344954] [ 1729]  1000  1729    26999      847
     56        0             0 sshd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344956] [ 1936]     0  1936     3633       39
     12        0             0 getty
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344957] [ 1937]     0  1937     3195       38
     12        0             0 getty
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344959] [ 2445]   106  2445     7863      151
     19        0             0 ntpd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344961] [ 3564]   108  3564    33045     1466
     55        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344962] [ 3566]   108  3566    33073     5407
     65        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344964] [ 3567]   108  3567    33045      331
     54        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344966] [ 3568]   108  3568    33045      528
     52        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344968] [ 3569]   108  3569    33253      534
     54        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344969] [ 3570]   108  3570    25222      376
     49        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344971] [ 3797]  1000  3797  2656388    54163
    197        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344973] [ 3840]  1000  3840     2826       97
      9        0             0 bash
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344975] [14391]     0 14391    26410      245
     53        0             0 sshd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344977] [14454]  1000 14454    26410      252
     51        0             0 sshd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344979] [14455]  1000 14455     5660      837
     16        0             0 bash
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344981] [18767]  1000 18767   421689    75153
    280        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344982] [18818]  1000 18818   427646    72071
    276        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344984] [18845]  1000 18845   422808    76361
    287        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344986] [18986]  1000 18986   413940    85847
    272        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344988] [19231]  1000 19231   423114    68900
    237        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344989] [19255]  1000 19255   422304    98381
    297        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344991] [19281]  1000 19281   453916    50170
    285        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344993] [19307]  1000 19307   421858    73572
    251        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344994] [20004]  1000 20004  2625480    62614
    227        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344996] [20059]  1000 20059  2417786   642515
   1907        0             0 kudu-tserver
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344998] [20075]  1000 20075   170158     4632
    136        0             0 kudu-master
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344999] [20091]  1000 20091  2534506   681406
   2028        0             0 kudu-tserver
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345001] [20100]  1000 20100  2480736   678210
   1996        0             0 kudu-tserver
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345003] [21180]  1000 21180     2812       85
     10        0             0 bash
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345004] [21194]  1000 21194  2131760    52636
    186        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345006] [21277]  1000 21277     3354      114
     11        0             0 bash
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345008] [21291]  1000 21291  2176412    96677
    367        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345010] [21441]  1000 21441     3354      114
     12        0             0 bash
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345011] [21455]  1000 21455  2171189    84863
    323        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345013] [21619]  1000 21619     3354      115
     11        0             0 bash
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345015] [21633]  1000 21633  2174403   126755
    412        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345016] [21773]  1000 21773     3354      115
     10        0             0 bash
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345018] [21787]  1000 21787  2165621   105043
    368        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345019] [22327]  1000 22327   699864    64647
    359        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345021] [22650]   108 22650    34060     1947
     62        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345023] [22651]   108 22651    34067     1837
     61        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345025] [22695]   108 22695    34564     5014
     67        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345026] [22696]   108 22696    34668     5491
     67        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345028] [22701]  1000 22701   499743   165103
    749        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345030] [22966]  1000 22966   404221    82336
    258        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345031] [49266]   108 49266    34579     5136
     67        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345033] [49267]   108 49267    34487     5004
     67        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345034] [49434]   108 49434    34298     1980
     61        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345036] [49435]   108 49435    34140     1836
     60        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345037] [49438]   108 49438    34575     5159
     67        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345039] [49439]   108 49439    34505     5146
     67        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345041] [49441]   108 49441    34590     5236
     67        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345042] [49442]   108 49442    34553     5045
     67        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345044] [49486]   108 49486    34507     5106
     67        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345046] [49487]   108 49487    34573     5240
     67        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345048] [12270]  1000 12270     3345      105
     11        0             0 bash
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345049] [12634]  1000 12634   106243     2424
    105        0             0 statestored
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345051] [12642]  1000 12642  2177880    69774
    301        0             0 catalogd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345052] [12708]  1000 12708  2599753    71304
    505        0             0 impalad
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345054] [12775]  1000 12775  2599480    69011
    503        0             0 impalad
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345055] [12844]  1000 12844  2599354    70697
    503        0             0 impalad
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345057] [13564]  1000 13564     3411      159
     12        0             0 bash
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345059] [13565]  1000 13565     2623      114
     11        0             0 make
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345060] [13568]  1000 13568     6091      149
     16        0             0 ctest
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345062] [70617]     0 70617    26410      246
     55        0             0 sshd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345063] [71103]  1000 71103    26444      247
     53        0             0 sshd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345065] [71113]  1000 71113     5628      806
     16        0             0 bash
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345066] [ 2745]  1000  2745  2286602   253086
    810        0             0 buffered-tuple-
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345068] [ 3922]  1000  3921  5818436  3452952
   6822        0             0 free-pool-test
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345070] Out of memory: Kill process 3922 (free-pool-test)
score 448 or sacrifice child
{code}
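
As a rough cross-check of the process table above: the rss column in the oom-killer dump is counted in pages, not kB. The sketch below converts a few rows copied from the dump to GiB, assuming 4 kB pages. The log does not state the page size, but the Mem-Info totals are consistent with it: active_anon:7546116 pages * 4 kB equals the sum of the DMA32 and Normal active_anon values (3689064 kB + 26495400 kB).

```python
# Convert rss values from the oom-killer process table (in pages) to GiB.
# PAGE_KB = 4 is an assumption; it is consistent with the Mem-Info totals.
PAGE_KB = 4

# (name, rss in pages), copied from the kern.log output above.
rss_pages = [
    ("free-pool-test", 3452952),
    ("kudu-tserver", 642515),
    ("kudu-tserver", 681406),
    ("kudu-tserver", 678210),
    ("buffered-tuple-", 253086),
]


def pages_to_gib(pages, page_kb=PAGE_KB):
    """Return the size in GiB of `pages` pages of `page_kb` kB each."""
    return pages * page_kb / (1024 * 1024)


# Sanity check of the page-size assumption against the zone summaries.
assert 7546116 * PAGE_KB == 3689064 + 26495400

for name, pages in rss_pages:
    print(f"{name:16s} {pages_to_gib(pages):6.2f} GiB")
```

Under that assumption, free-pool-test alone had roughly 13 GiB resident when the table was printed, which is well above the estimate of around 7 GB, so that estimate may be worth re-checking.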

Shortly after the kill, /proc/meminfo showed:
{code}
ubuntu@ip-172-31-7-2:~/Impala/logs/be_tests$ cat /proc/meminfo 
MemTotal:       30871632 kB
MemFree:         7011044 kB
Buffers:           40700 kB
Cached:          5438488 kB
SwapCached:            0 kB
Active:         21124408 kB
Inactive:        2125936 kB
Active(anon):   17793440 kB
Inactive(anon):     7516 kB
Active(file):    3330968 kB
Inactive(file):  2118420 kB
Unevictable:          20 kB
Mlocked:              20 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:              1132 kB
Writeback:             0 kB
AnonPages:      17781632 kB
Mapped:           138604 kB
Shmem:             29780 kB
Slab:             276868 kB
SReclaimable:     182520 kB
SUnreclaim:        94348 kB
KernelStack:       39152 kB
PageTables:        66140 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    15435816 kB
Committed_AS:   49437784 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       72396 kB
VmallocChunk:   34359655000 kB
HardwareCorrupted:     0 kB
AnonHugePages:  15386624 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       38912 kB
DirectMap2M:     3237888 kB
DirectMap1G:    28311552 kB
{code}
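
The CommitLimit figure can be reproduced from the values above. Per the kernel's /proc/meminfo documentation, CommitLimit is roughly SwapTotal + MemTotal * vm.overcommit_ratio / 100 (excluding huge pages), and it is only enforced when vm.overcommit_memory is set to 2. A minimal sketch, assuming the default overcommit_ratio of 50, since the sysctl values were not captured in this report:

```python
# Values in kB, copied from the /proc/meminfo output above.
meminfo = {
    "MemTotal": 30871632,
    "SwapTotal": 0,
    "CommitLimit": 15435816,
    "Committed_AS": 49437784,
}

# Assumption: vm.overcommit_ratio is at its default of 50 (not shown in the report).
OVERCOMMIT_RATIO = 50


def commit_limit_kb(mem_total_kb, swap_total_kb, ratio=OVERCOMMIT_RATIO):
    """Approximate CommitLimit in kB; huge pages are ignored (HugePages_Total is 0 here)."""
    return swap_total_kb + mem_total_kb * ratio // 100


expected = commit_limit_kb(meminfo["MemTotal"], meminfo["SwapTotal"])
overcommit_factor = meminfo["Committed_AS"] / meminfo["CommitLimit"]

print("computed CommitLimit matches meminfo:", expected == meminfo["CommitLimit"])
print(f"Committed_AS / CommitLimit = {overcommit_factor:.2f}")
```

The computed value matches the reported CommitLimit, and Committed_AS is already about 3.2 times that limit. This suggests strict overcommit accounting is not enabled on these hosts, so the kill is more likely driven by physical memory running out with no swap than by CommitLimit itself.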

We should probably have larger VMs for these jobs, but we may also need to consider reducing
the memory needed for BE tests.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
