hbase-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: WALPlayer kills many RS when play large number of WALs
Date Tue, 22 Jul 2014 16:12:22 GMT
You need to better manage the colocation of the mapreduce runtime. In other
words, you are allowing mapreduce to grab too many node resources,
resulting in activation of the kernel's OOM killer. A good rule of thumb is
that the aggregate of all Java heaps (daemons like DataNode, RegionServer,
NodeManager, etc. + the max allowed number of concurrent mapreduce tasks *
task heap setting) should fit within the node's physical memory. Reduce the
allowed mapreduce task concurrency.
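
As a rough illustration of that arithmetic, here is a minimal sketch. The
node size, daemon heap sizes, and OS reserve are made-up numbers, not values
from this cluster; only the 2048 MB task heap comes from the quoted
-Xmx2048m setting. The function names are hypothetical as well.

```python
def total_heap_mb(daemon_heaps_mb, concurrent_tasks, task_heap_mb):
    """Aggregate Java heap on one node: all daemons plus all running tasks."""
    return sum(daemon_heaps_mb.values()) + concurrent_tasks * task_heap_mb


def max_concurrent_tasks(physical_mb, daemon_heaps_mb, task_heap_mb,
                         os_reserve_mb=2048):
    """Largest task count whose heaps still fit next to the daemons.

    os_reserve_mb is an assumed allowance for the kernel and page cache.
    """
    if task_heap_mb <= 0:
        raise ValueError("task_heap_mb must be positive")
    free_mb = physical_mb - os_reserve_mb - sum(daemon_heaps_mb.values())
    return max(0, free_mb // task_heap_mb)


# Hypothetical 32 GB node; heap sizes below are assumptions for illustration.
daemons = {"DataNode": 1024, "RegionServer": 16384, "NodeManager": 1024}
tasks = max_concurrent_tasks(32768, daemons, task_heap_mb=2048)
print(tasks, total_heap_mb(daemons, tasks, 2048))
```

A JVM uses more memory than its -Xmx value (thread stacks, off-heap
buffers, and so on), so the real limit is probably somewhat lower than what
this returns. On YARN the per-node cap is usually expressed through
yarn.nodemanager.resource.memory-mb rather than as a task count.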


On Tue, Jul 22, 2014 at 8:15 AM, Tianying Chang <tychang@gmail.com> wrote:

> Hi
>
> I was running WALPlayer to output HFiles for a future bulk load. There are
> 6200 hlogs, and the total size is about 400G.
>
> The mapreduce job finished. But I saw two bad things:
> 1. More than half of the RSs died. I checked the syslog, and it seems they
> were killed by the OOM killer. They also had a very high CPU spike for the
> whole time WALPlayer was running:
>
> cpu user usage of 84.4% matches resource limit [cpu user usage>70.0%]
>
> 2. The mapreduce job also had failures with Java heap space errors. My job
> set the task heap to 2G:
> mapred.child.java.opts=-Xmx2048m
> Does this mean WALPlayer cannot support this load with this kind of setting?
>
> Thanks
> Tian-Ying
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
