hive-dev mailing list archives

From Rajesh Balamohan <>
Subject Re: Oversized container estimation
Date Sat, 26 Nov 2016 01:39:51 GMT
Those are cumulative figures at the DAG level, summed across all tasks. To
see whether each task actually uses its full memory, check the GC logs
emitted at the task level. I am not sure what the YARN minimum container
size is in your cluster, but depending on that, lowering the container size
risks running too many containers on the same node. For example, a 98 GB
machine with 2 GB as both the Hive container size and the YARN minimum
container size would run 49 containers. If you have only 32 CPUs in your
system, that would oversubscribe the CPUs heavily and could adversely
impact job performance.
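
The arithmetic behind that example can be sketched as follows. This is only
a rough illustration: the function names are made up for this note, it
assumes all 98 GB of the node is handed to YARN, and it assumes the
scheduler rounds each request up to a multiple of the minimum allocation
(yarn.scheduler.minimum-allocation-mb), which may not match your scheduler
settings.

```python
def containers_per_node(node_memory_mb, container_mb, yarn_min_alloc_mb):
    """Return (container count, granted size in MB) for one node.

    Assumption: each request is rounded up to a multiple of the YARN
    minimum allocation, and memory is the only limit on container count.
    """
    # Integer ceiling division: number of min-allocation units per container.
    units = -(-container_mb // yarn_min_alloc_mb)
    granted_mb = units * yarn_min_alloc_mb
    return node_memory_mb // granted_mb, granted_mb


def cpu_oversubscription(containers, cpus):
    """Ratio of concurrently running containers to available CPUs."""
    return containers / cpus


if __name__ == "__main__":
    node_mb = 98 * 1024  # the 98 GB machine from the example
    cpus = 32            # the 32 CPUs from the example

    # (hive.tez.container.size, YARN minimum allocation), both in MB.
    # Only the 2 GB / 2 GB row comes from the example; the rest are
    # hypothetical settings for comparison.
    for container_mb, min_alloc_mb in [(4096, 1024), (2048, 2048), (1536, 512)]:
        count, granted = containers_per_node(node_mb, container_mb, min_alloc_mb)
        ratio = cpu_oversubscription(count, cpus)
        print(f"{container_mb} MB requested, {granted} MB granted: "
              f"{count} containers, {ratio:.2f} per CPU")
```

With 2 GB containers this gives 49 containers on 32 CPUs, roughly 1.5
containers per CPU. Note that a 1.5 GB request with a 2 GB minimum
allocation would still be granted 2 GB, so it would not save any memory.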


On Fri, Nov 25, 2016 at 11:03 PM, Ranjan Banerjee <> wrote:

> Hi everyone,
> I have a cluster where each container is configured at 4 GB, and some of
> my queries finish in 30 to 40 seconds. This leads me to believe that I
> have too much memory for my containers, and I am thinking of reducing the
> container size to 1.5 GB (hive.tez.container.size), but I am looking for a
> few more concrete data points to find out whether I really have oversized
> containers.
> I looked into the Tez view of my DAG and the counters give me:
> VIRTUAL_MEMORY_BYTES 1560263561216
> I am guessing this is wrong as there is no way the query could finish in
> 20 seconds on a 98GB cluster if the actual memory required by the query is
> 907GB. Any help to find some data points regarding determination of
> oversized containers is very much appreciated!
> Thanks
> Ranjan
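
To put the quoted counter in perspective, here is a small sketch that
converts it to GiB and spreads it over the tasks of the DAG. The task count
used below is purely hypothetical (it is not given anywhere in this thread),
and the helper names are invented for illustration. Keep in mind that
VIRTUAL_MEMORY_BYTES measures virtual address space, so even the per-task
average overstates what the tasks actually touch.

```python
GIB = 1024 ** 3


def bytes_to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / GIB


def average_per_task_gib(cumulative_bytes, num_tasks):
    """Average of a DAG-level cumulative counter over its tasks, in GiB."""
    if num_tasks <= 0:
        raise ValueError("num_tasks must be positive")
    return bytes_to_gib(cumulative_bytes) / num_tasks


if __name__ == "__main__":
    virtual_memory_bytes = 1560263561216  # counter value quoted above
    hypothetical_tasks = 400              # assumption, not from the thread

    total = bytes_to_gib(virtual_memory_bytes)
    average = average_per_task_gib(virtual_memory_bytes, hypothetical_tasks)
    print(f"cumulative: {total:.1f} GiB")
    print(f"average over {hypothetical_tasks} tasks: {average:.2f} GiB")
```

The real task count can be read from the same Tez view; a large cumulative
figure by itself says nothing about whether any single container is
oversized.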

