hadoop-user mailing list archives

From Adam Kawa <kawa.a...@gmail.com>
Subject Re: CPU utilization
Date Fri, 12 Sep 2014 16:23:15 GMT
Hi,

With these settings, you are able to start at most 2 containers per
NodeManager (yarn.nodemanager.resource.memory-mb = 2048). The size of your
containers is between 768 and 1024 MB (I am not sure what your value of
yarn.nodemanager.resource.cpu-vcores is).
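
To make the arithmetic explicit (a rough sketch using your numbers; requests
are rounded up to multiples of yarn.scheduler.minimum-allocation-mb = 256, so
these values are already the effective container sizes):

  NodeManager capacity : 2048 MB (yarn.nodemanager.resource.memory-mb)
  map container        :  768 MB (mapreduce.map.memory.mb)
  reduce container     : 1024 MB (mapreduce.reduce.memory.mb)
  AM container         : 1024 MB (yarn.app.mapreduce.am.resource.mb)

  floor(2048 / 768)  = 2 map containers per node
  floor(2048 / 1024) = 2 containers per node at the 1024 MB size

So no node can ever run more than 2 containers at once, which caps how much
CPU your job can drive.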
Have you tried running more (or bigger) jobs on the cluster concurrently?
Then you might see CPU utilization higher than 30%.
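
As a quick experiment (just an illustration - the path to the examples jar
varies by distribution), you could submit a couple of the bundled example
jobs at the same time and watch CPU on the DataNodes:

  hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 8 100000 &
  hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 8 100000 &
  wait

Keep in mind that with only 2048 MB per NodeManager the extra containers will
mostly wait in the queue, so if the machines have more RAM you may also want
to raise yarn.nodemanager.resource.memory-mb.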

Cheers!
Adam

2014-09-12 17:51 GMT+02:00 Jakub Stransky <stransky.ja@gmail.com>:

> Hello experienced Hadoop users,
>
> I have a beginner's question about CPU utilization on DataNodes when
> running an MR job. The cluster has 5 machines, 2 NN + 3 DN, on really
> inexpensive hardware, using the following parameters:
> # hadoop - yarn-site.xml
> yarn.nodemanager.resource.memory-mb  : 2048
> yarn.scheduler.minimum-allocation-mb : 256
> yarn.scheduler.maximum-allocation-mb : 2048
>
> # hadoop - mapred-site.xml
> mapreduce.map.memory.mb              : 768
> mapreduce.map.java.opts              : -Xmx512m
> mapreduce.reduce.memory.mb           : 1024
> mapreduce.reduce.java.opts           : -Xmx768m
> mapreduce.task.io.sort.mb            : 100
> yarn.app.mapreduce.am.resource.mb    : 1024
> yarn.app.mapreduce.am.command-opts   : -Xmx768m
>
> and I have a map-only job which uses 3 mappers, essentially distributed
> across the cluster - 1 task per DN. What I see on the cluster nodes is
> that CPU utilization doesn't exceed 30%.
>
> Am I right that Hadoop really limits all the resources on a per-container
> basis? I wasn't able to find any command/setting which would prove this
> theory. The ulimit for YARN was unlimited, etc.
>
> Not sure if I am missing something here
>
> Thanks for providing more insight into resource planning and utilization
> Jakub
