hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-611) SequenceFile.Sorter should have a merge method that returns an iterator
Date Fri, 27 Oct 2006 23:39:19 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-611?page=comments#action_12445297 ] 
Devaraj Das commented on HADOOP-611:

The current merge code keeps 'merge-factor' keys and values in memory at once.
While implementing this, one thought was that we can avoid holding all 'merge-factor' values
in memory at the same time and instead fetch each value only when it is needed. When the user
of the merge code calls next() on the MergeQueue to fetch a key/value, the system loads into
memory only the value corresponding to the 'minimum' key; loading of every other value is
deferred until its key becomes the minimum.
I have implemented this for Compression = NONE and RECORD. For BLOCK compression, the
code that avoids proactively loading values already exists and is controlled by the boolean
"lazyDecompression", so nothing extra needs to be done. However, lazyDecompression is set via
the Hadoop config (defaulting to true). I am wondering whether it makes sense to remove this
configurable item and always treat it as true.
Any objections?
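
To make the deferred loading concrete, here is a minimal sketch of the idea, not the actual SequenceFile.Sorter code. The class names LazyMergeSketch and Segment and the method readCurrentValue() are hypothetical, and in-memory string arrays stand in for on-disk segments. Only the MergeQueue name and its next() call come from the comment above. The queue orders segments by their current key and reads a value only once its key has been polled as the minimum.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

public class LazyMergeSketch {

    /** Hypothetical stand-in for one sorted input segment. */
    static final class Segment {
        private final String[] keys;
        private final String[] values; // stands in for value bytes still on disk
        private int pos = 0;

        Segment(String[] keys, String[] values) {
            this.keys = keys;
            this.values = values;
        }

        boolean exhausted() { return pos >= keys.length; }

        String currentKey() { return keys[pos]; }

        // A real reader would read (or decompress) the value bytes here.
        String readCurrentValue() { return values[pos]; }

        void advance() { pos++; }
    }

    /** Orders segments by current key; values are fetched only in next(). */
    static final class MergeQueue {
        private final PriorityQueue<Segment> heap =
            new PriorityQueue<>((a, b) -> a.currentKey().compareTo(b.currentKey()));

        MergeQueue(List<Segment> segments) {
            for (Segment s : segments) {
                if (!s.exhausted()) {
                    heap.add(s);
                }
            }
        }

        boolean hasNext() { return !heap.isEmpty(); }

        Map.Entry<String, String> next() {
            Segment min = heap.poll();
            if (min == null) {
                throw new NoSuchElementException();
            }
            String key = min.currentKey();
            String value = min.readCurrentValue(); // deferred until this key is the minimum
            min.advance();
            if (!min.exhausted()) {
                heap.add(min); // re-insert under its new current key
            }
            return new SimpleEntry<>(key, value);
        }
    }

    /** Drains the queue, mimicking a merge that just iterates and writes records out. */
    static List<String> mergeAll(List<Segment> segments) {
        MergeQueue queue = new MergeQueue(segments);
        List<String> out = new ArrayList<>();
        while (queue.hasNext()) {
            Map.Entry<String, String> e = queue.next();
            out.add(e.getKey() + "=" + e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Segment> segments = Arrays.asList(
            new Segment(new String[] {"a", "d"}, new String[] {"1", "4"}),
            new Segment(new String[] {"b", "c"}, new String[] {"2", "3"}));
        System.out.println(mergeAll(segments)); // prints [a=1, b=2, c=3, d=4]
    }
}
```

With this shape, at most one value is materialized per next() call, while up to 'merge-factor' keys stay resident in the heap.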

> SequenceFile.Sorter should have a merge method that returns an iterator
> -----------------------------------------------------------------------
>                 Key: HADOOP-611
>                 URL: http://issues.apache.org/jira/browse/HADOOP-611
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: io
>            Reporter: Owen O'Malley
>         Assigned To: Devaraj Das
>             Fix For: 0.8.0
> SequenceFile.Sorter should get a new merge method that returns an iterator over the keys/values.
> The current merge method should become a simple method that gets the iterator and writes
> the records out to a file.


