hadoop-hdfs-user mailing list archives

From Ajay Srivastava <Ajay.Srivast...@guavus.com>
Subject io.sort.factor
Date Wed, 23 Jan 2013 06:23:07 GMT

io.sort.factor  --  The number of streams to merge at once while sorting files. This determines
the number of open file handles.
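
To make "streams to merge at once" concrete, here is a rough sketch of how the merge factor relates to the number of merge passes. It is a simplified model under stated assumptions, not Hadoop's actual merge code: it assumes every pass merges up to `factor` segments into one, and the function name `merge_passes` and the segment counts are hypothetical.

```python
def merge_passes(num_segments: int, factor: int) -> int:
    """Estimate merge passes needed to reduce num_segments sorted
    segments to one, merging at most `factor` segments at a time.

    Simplified model (assumption): each pass turns n segments into
    ceil(n / factor) segments. Hadoop's real merger differs in detail.
    """
    if factor < 2:
        raise ValueError("factor must be at least 2")
    passes = 0
    while num_segments > 1:
        # Ceiling division: groups of up to `factor` segments each
        # collapse into a single merged segment.
        num_segments = -(-num_segments // factor)
        passes += 1
    return passes


if __name__ == "__main__":
    # Hypothetical segment counts, compared across merge factors.
    for segments in (8, 50, 500):
        for factor in (10, 100):
            print(segments, factor, merge_passes(segments, factor))
```

In this model, 8 segments need one pass whether the factor is 10 or 100, while 50 segments need two passes at factor 10 and one pass at factor 100. A larger factor only reduces the pass count once the number of segments exceeds the current factor.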

How can I use this parameter to improve the performance of a MapReduce job?
My understanding from the description above was that if there are many spilled records, increasing
both io.sort.mb and io.sort.factor should improve performance. Increasing io.sort.mb
helped, but setting io.sort.factor to values above 10 does not seem to improve or degrade the
performance of the MapReduce job.

Ajay Srivastava