hadoop-common-user mailing list archives

From Saurabh Dutta <saurabh.du...@impetus.co.in>
Subject RE: Spilled Records
Date Tue, 22 Feb 2011 03:46:51 GMT
Hi Maha,

The spilled records counter reflects the transient data written to disk during both the map
and reduce operations; it is not only the map side that generates spilled records. On the
reduce side, when the in-memory shuffle buffer fills to the threshold set by
mapred.job.shuffle.merge.percent, or the number of buffered map outputs reaches
mapred.inmem.merge.threshold, the buffered outputs are merged and spilled to disk.
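
The merge trigger described above can be sketched roughly as follows. This is a simplified
model in Python, not Hadoop's actual code: the function name and its arguments are
hypothetical, and the defaults (0.66 and 1000) are what I believe the 0.20-era defaults to
be, so please check them against your version.

```python
def should_merge_to_disk(buffered_bytes, buffer_capacity_bytes, buffered_map_outputs,
                         merge_percent=0.66, inmem_merge_threshold=1000):
    """Return True when the in-memory shuffle buffer should be merged and spilled.

    Hypothetical helper: models the two conditions only, not Hadoop internals.
    """
    # Condition 1: usage has reached the mapred.job.shuffle.merge.percent fraction.
    usage_reached = buffered_bytes >= merge_percent * buffer_capacity_bytes
    # Condition 2: the count of buffered map outputs has reached
    # mapred.inmem.merge.threshold (a value <= 0 is treated here as "no limit").
    count_reached = (inmem_merge_threshold > 0
                     and buffered_map_outputs >= inmem_merge_threshold)
    return usage_reached or count_reached
```

Either condition alone is enough to start a merge, which is why both parameters are worth
looking at.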

You are going in the right direction by tuning io.sort.mb; try increasing it further. If that
still doesn't help, try io.sort.factor and fs.inmemory.size.mb. Also try the other two
variables that I mentioned earlier.
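
To get a feel for what io.sort.mb changes on the map side, here is a rough
back-of-the-envelope estimate in Python. It is only a sketch under stated assumptions: the
function name is hypothetical, the 16 bytes of accounting data per record and the defaults
for io.sort.record.percent (0.05) and io.sort.spill.percent (0.80) are what I believe
applies to 0.20-era releases, and the sample workload (1,000,000 records of 50 bytes) is
made up.

```python
import math

# Assumed per-record metadata cost in the map-side sort buffer (0.20-era).
ACCOUNTING_BYTES_PER_RECORD = 16


def estimate_map_spills(num_records, avg_record_bytes, io_sort_mb=100,
                        record_percent=0.05, spill_percent=0.80):
    """Estimate how many spill files one map task writes (simplified model)."""
    buffer_bytes = io_sort_mb * 1024 * 1024
    # io.sort.record.percent reserves part of the buffer for record accounting;
    # the remainder holds the serialized keys and values.
    accounting_bytes = record_percent * buffer_bytes
    data_bytes = buffer_bytes - accounting_bytes
    # A spill starts when either region reaches io.sort.spill.percent.
    by_accounting = spill_percent * accounting_bytes / ACCOUNTING_BYTES_PER_RECORD
    by_data = spill_percent * data_bytes / avg_record_bytes
    records_per_spill = max(1, int(min(by_accounting, by_data)))
    return math.ceil(num_records / records_per_spill)


for mb in (100, 200):
    print(mb, estimate_map_spills(1_000_000, 50, io_sort_mb=mb))
```

For this made-up workload the estimate drops from 4 spill files to 2 when io.sort.mb is
doubled. Note that the model counts spill files, not the Spilled Records counter; with small
records the accounting region, not the data region, is what fills first.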

Let us know what worked for you.

Sincerely,
Saurabh Dutta
Impetus Infotech India Pvt. Ltd.,
Sarda House, 24-B, Palasia, A.B.Road, Indore - 452 001
Phone: +91-731-4269200 4623
Fax: + 91-731-4071256
Email: saurabh.dutta@impetus.co.in
www.impetus.com
________________________________________
From: maha [maha@umail.ucsb.edu]
Sent: Tuesday, February 22, 2011 8:21 AM
To: common-user
Subject: Spilled Records

Hello everyone,

Do spilled records mean that the sort buffer is too small to hold all the input records,
so some records are written to local disk?

If so, I tried raising io.sort.mb from the default of 100 to 200, and the number of spilled
records stayed the same. Why?

Might changing io.sort.record.percent to .9 instead of .8 produce unexpected exceptions?


Thank you,
Maha

