hadoop-common-dev mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Read MapFileOutputFormat output in ascending key order
Date Wed, 13 Feb 2008 19:08:30 GMT
Andrzej Bialecki wrote:
> Hmm ... the idea was to avoid the cost of additional I/O, and read the 
> parts directly as they are. If I understand it correctly, the 
> Sorter.merge() needs to rewrite the files in order to merge them, which 
> means a lot of I/O.

It only rewrites things if there are more parts than the mergefactor. 
So if you increase the mergefactor to the number of parts, then no data 
will be written.

Doug
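
A minimal sketch of the point above, assuming a simplified multi-pass merge rather than Hadoop's actual Sorter code. The class MergeFactorSketch, its methods, and the intermediateRecordsWritten counter are hypothetical names made up for illustration; in-memory lists stand in for the sorted part files, and the counter stands in for data rewritten to disk. It only shows why a merge factor at least as large as the number of parts lets the merge stream straight from the inputs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class MergeFactorSketch {
    // Records copied into intermediate runs; a stand-in for data rewritten to disk.
    static long intermediateRecordsWritten = 0;

    // Merge already-sorted runs into one sorted list using a min-heap of cursors.
    static List<Integer> mergeOnce(List<List<Integer>> runs) {
        // Each heap entry is {value, runIndex, positionInRun}.
        PriorityQueue<int[]> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int i = 0; i < runs.size(); i++) {
            if (!runs.get(i).isEmpty()) {
                heap.add(new int[] {runs.get(i).get(0), i, 0});
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] top = heap.poll();
            out.add(top[0]);
            List<Integer> run = runs.get(top[1]);
            int next = top[2] + 1;
            if (next < run.size()) {
                heap.add(new int[] {run.get(next), top[1], next});
            }
        }
        return out;
    }

    // Merge sorted parts, never opening more than `factor` runs at once.
    static List<Integer> merge(List<List<Integer>> parts, int factor) {
        if (factor < 2) {
            throw new IllegalArgumentException("factor must be >= 2");
        }
        intermediateRecordsWritten = 0;
        List<List<Integer>> runs = new ArrayList<>(parts);
        // Intermediate passes happen only while there are more runs than the factor.
        while (runs.size() > factor) {
            List<List<Integer>> next = new ArrayList<>();
            for (int i = 0; i < runs.size(); i += factor) {
                List<List<Integer>> group =
                    runs.subList(i, Math.min(i + factor, runs.size()));
                if (group.size() == 1) {
                    next.add(group.get(0)); // a lone run is carried over, not rewritten
                } else {
                    List<Integer> merged = mergeOnce(group);
                    intermediateRecordsWritten += merged.size();
                    next.add(merged);
                }
            }
            runs = next;
        }
        // Final pass: at most `factor` runs remain, so output streams from them directly.
        return mergeOnce(runs);
    }
}
```

Calling merge(parts, parts.size()) skips the while loop entirely, so the counter stays at zero and the keys come out in ascending order from the final pass alone. With a smaller factor, some records are copied into intermediate runs before the final pass, which is the extra I/O discussed in the quoted question.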
