hadoop-mapreduce-user mailing list archives

From Jakub Stransky <stransky...@gmail.com>
Subject CPU utilization
Date Fri, 12 Sep 2014 15:51:23 GMT
Hello experienced hadoop users,

I have a beginner's question regarding CPU utilization on the datanodes when
running an MR job. The cluster has 5 machines (2 NN + 3 DN), really inexpensive
hardware, with the following parameters:
# hadoop - yarn-site.xml
yarn.nodemanager.resource.memory-mb  : 2048
yarn.scheduler.minimum-allocation-mb : 256
yarn.scheduler.maximum-allocation-mb : 2048

# hadoop - mapred-site.xml
mapreduce.map.memory.mb              : 768
mapreduce.map.java.opts              : -Xmx512m
mapreduce.reduce.memory.mb           : 1024
mapreduce.reduce.java.opts           : -Xmx768m
mapreduce.task.io.sort.mb            : 100
yarn.app.mapreduce.am.resource.mb    : 1024
yarn.app.mapreduce.am.command-opts   : -Xmx768m
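
(For completeness, a minimal sketch of how the yarn-site.xml values above are set,
as standard Hadoop <property> entries, trimmed to just those three keys;
mapred-site.xml looks the same with the mapreduce.* keys:)

<!-- yarn-site.xml: sketch, only the keys listed above -->
<configuration>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>256</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
  </property>
</configuration>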

I have a map-only job which uses 3 mappers, and these are essentially
distributed across the cluster, 1 task per DN. What I see on the cluster
nodes is that CPU utilization never exceeds 30%.
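
My rough capacity math, as a sketch and assuming memory is the only resource
the scheduler accounts for (which I understand is the default), using the
values above:

# per-DN container capacity from the settings above
map containers per DN   = floor(2048 / 768) = 2
AM + 1 map on one DN    = 1024 + 768 = 1792 <= 2048 (fits)

So with only 3 mappers, no node ever runs more than one map task at a time.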

Am I right that Hadoop really limits all the resources on a per-container
basis? I wasn't able to find any command/setting that would confirm this
theory. The ulimits for yarn were unlimited, etc.

Not sure if I am missing something here

Thanks in advance for any insight into resource planning and utilization.
Jakub
