hadoop-mapreduce-user mailing list archives

From Matt Kowalczyk <matt.kowalc...@gmail.com>
Subject Re: yarn memory settings in heterogeneous cluster
Date Fri, 28 Aug 2015 19:41:43 GMT

Thanks for your prompt response. I'll take a look at the per-job memory
settings, which from your explanation should resolve my issue.


On Fri, Aug 28, 2015 at 12:35 PM, Vinod Kumar Vavilapalli <
vinodkv@hortonworks.com> wrote:

> Hi Matt,
>
> Replies inline.
>
> > I'm using the Capacity Scheduler and deploy mapred-site.xml and
> > yarn-site.xml configuration files with various memory settings that are
> > tailored to the resources for a particular machine. The master node and
> > the two slave node classes each get a different configuration file since
> > they have different memory profiles.
>
> We are improving this starting with 2.8 so as to not require different
> configuration files - see https://issues.apache.org/jira/browse/YARN-160.
>
> > yarn.scheduler.minimum-allocation-mb: This appears to behave as a
> > cluster-wide setting; however, due to my two node classes, a per-node
> > yarn.scheduler.minimum-allocation-mb would be desirable.
>
> Actually the minimum container size is a cluster-level constant by design.
> It doesn’t matter how big or small nodes are in the cluster, the minimum
> size needs to be a constant for applications to have a notion of
> deterministic sizing. What we instead suggest is to simply run more
> containers on bigger machines using the yarn.nodemanager.resource.memory-mb
> configuration.
>
> On the other hand, the maximum container size should obviously be at most
> the size of the smallest node in the cluster. Otherwise, again, you may
> cause nondeterministic scheduling behavior for apps.
>
> > More concretely, suppose I have two jobs with differing memory
> > requirements--how would I communicate this to yarn and request that my
> > containers be allocated with additional memory?
>
> This is a more apt ask. The minimum container size doesn’t determine the
> container size! Containers can be sized at various multiples of the
> minimum, driven by the application or by frameworks like MapReduce. For
> example, even if the minimum container size in the cluster is 1GB, the
> MapReduce framework can ask for bigger containers if the user sets
> mapreduce.map.memory.mb to 2GB/4GB etc. And this is controllable at the
> job level!
>
> +Vinod
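
The sizing arithmetic described in the quoted reply can be sketched in a few
lines of Python. This is only an illustration, not YARN's actual code: the
function names, the 1GB/8GB limits, and the two node sizes are hypothetical,
and it assumes that a request is rounded up to a multiple of
yarn.scheduler.minimum-allocation-mb and that a request above the cluster
maximum is rejected. It also ignores vcores and any other resource.

```python
import math


def normalize_request_mb(requested_mb, min_alloc_mb, max_alloc_mb):
    """Round a per-job memory request up to a multiple of the cluster minimum.

    Assumption: a request above the cluster maximum is treated as an error.
    """
    if requested_mb > max_alloc_mb:
        raise ValueError("request exceeds the cluster maximum allocation")
    # A request never shrinks below one minimum-sized unit.
    units = max(1, math.ceil(requested_mb / min_alloc_mb))
    return min(units * min_alloc_mb, max_alloc_mb)


def containers_per_node(node_memory_mb, container_mb):
    """Whole containers of one size that fit in a node's advertised memory,
    i.e. the value that node sets for yarn.nodemanager.resource.memory-mb."""
    return node_memory_mb // container_mb


# Hypothetical cluster: 1GB minimum, 8GB maximum, two node classes.
MIN_MB, MAX_MB = 1024, 8192
SMALL_NODE_MB, LARGE_NODE_MB = 8192, 32768

for requested in (1024, 2048, 3000):  # e.g. values of mapreduce.map.memory.mb
    granted = normalize_request_mb(requested, MIN_MB, MAX_MB)
    print(requested, granted,
          containers_per_node(SMALL_NODE_MB, granted),
          containers_per_node(LARGE_NODE_MB, granted))
```

Under these assumptions the minimum stays a single cluster-wide unit, the
larger node class simply holds more containers of any given size, and each
job picks its own multiple of that unit.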
