hadoop-common-user mailing list archives

From sandeep das <yarnhad...@gmail.com>
Subject Data spilling on disk from MR jobs
Date Wed, 18 Nov 2015 12:33:40 GMT

I'm running my Pig script on YARN (MR2). While going through some tuning
parameters, I found that the value of the parameter
"mapreduce.task.io.sort.mb" should be tuned properly. By default it is
set to 256 MB in my Cloudera setup.

I would like to know how I can find out whether my MR jobs are spilling
data to disk. Are there any logs that show how much data was spilled to
disk? Is there any parameter that can be configured to enable such
logging?
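
For context, one place this kind of information may show up is the job's
built-in counters: MapReduce reports "Spilled Records" and "Map output
records" for each job. The sketch below is only an illustration of how those
two numbers could be compared. The function names and the threshold are
hypothetical, the counter values have to be copied in by hand, and the
job-level "Spilled Records" total can also include reduce-side spills, so
the result is a rough hint rather than a definitive measure.

```python
def spill_ratio(spilled_records: int, map_output_records: int) -> float:
    """Return spilled records per map output record (hypothetical helper).

    Both arguments are counter values read manually from the job's counter
    page or job history; nothing here talks to Hadoop itself.
    """
    if map_output_records <= 0:
        raise ValueError("map_output_records must be positive")
    return spilled_records / map_output_records


def has_extra_spills(spilled_records: int, map_output_records: int) -> bool:
    """Assumed heuristic: a ratio above 1 suggests records were written to
    disk more than once, e.g. because the sort buffer filled up repeatedly."""
    return spill_ratio(spilled_records, map_output_records) > 1.0


if __name__ == "__main__":
    # Made-up example values, not taken from a real job.
    print(spill_ratio(300, 100))
    print(has_extra_spills(300, 100))
```

A ratio close to 1 would suggest each map output record was written out
once; a combiner or reduce-side spills can shift the number either way, so
per-task counters are likely more reliable than the job total.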

CDH: CDH-5.4.4-1.cdh5.4.4.p0.4
Hadoop: 2.6.0-cdh5.4.4

Let me know in case more information is required.

