hadoop-common-dev mailing list archives

From "Jim Kellerman (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-603) Extend SequenceFile to provide MapFile function by storing index at the end of the file
Date Sat, 14 Oct 2006 15:11:35 GMT
Extend SequenceFile to provide MapFile function by storing index at the end of the file
---------------------------------------------------------------------------------------

                 Key: HADOOP-603
                 URL: http://issues.apache.org/jira/browse/HADOOP-603
             Project: Hadoop
          Issue Type: Improvement
          Components: dfs
            Reporter: Jim Kellerman


MapFile increases the load on the name node because it creates two files (a data file and an
index file) to provide an indexed file format. If SequenceFile were extended to store the index
at the end of the file, only half as many files would be needed for a map/reduce operation,
reducing the load on the name node.
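
To make the proposed layout concrete, here is a minimal sketch of a single file holding sorted
records, followed by a sparse index, followed by a fixed-size footer that records where the index
starts. This is not the SequenceFile or MapFile on-disk format; the record encoding, the footer
layout, INDEX_INTERVAL, and the function names write_indexed and lookup are all assumptions made
up for illustration.

```python
import bisect
import io
import struct

INDEX_INTERVAL = 2          # hypothetical: index every 2nd record
FOOTER = ">QI"              # hypothetical footer: index start offset, entry count


def _write_bytes(out, data):
    # Length-prefixed byte string (4-byte big-endian length).
    out.write(struct.pack(">I", len(data)))
    out.write(data)


def _read_bytes(inp):
    (n,) = struct.unpack(">I", inp.read(4))
    return inp.read(n)


def write_indexed(out, records):
    """Write (key, value) byte pairs, already sorted by key, then index and footer."""
    index = []
    for i, (key, value) in enumerate(records):
        if i % INDEX_INTERVAL == 0:
            index.append((key, out.tell()))   # remember where this record starts
        _write_bytes(out, key)
        _write_bytes(out, value)
    index_start = out.tell()
    for key, offset in index:
        _write_bytes(out, key)
        out.write(struct.pack(">Q", offset))
    # The footer is the last thing in the file, so a reader can find the index.
    out.write(struct.pack(FOOTER, index_start, len(index)))


def lookup(inp, wanted):
    """Return the value stored for `wanted`, or None if the key is absent."""
    footer_size = struct.calcsize(FOOTER)
    inp.seek(-footer_size, io.SEEK_END)
    index_start, count = struct.unpack(FOOTER, inp.read(footer_size))

    # Load the sparse index from the end of the file.
    inp.seek(index_start)
    keys, offsets = [], []
    for _ in range(count):
        keys.append(_read_bytes(inp))
        (offset,) = struct.unpack(">Q", inp.read(8))
        offsets.append(offset)

    # Find the last indexed key <= wanted, then scan forward from there.
    pos = bisect.bisect_right(keys, wanted) - 1
    if pos < 0:
        return None
    inp.seek(offsets[pos])
    while inp.tell() < index_start:
        key = _read_bytes(inp)
        value = _read_bytes(inp)
        if key == wanted:
            return value
        if key > wanted:
            break
    return None
```

For example, writing a few sorted pairs to an io.BytesIO with write_indexed and then calling
lookup on the same buffer returns the stored value or None. Because the index can only be written
after all records, the writer has to hold it in memory until close, and the reader needs one seek
to the end of the file before the first lookup; in exchange, only one file per data set has to be
registered with the name node.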

Perhaps this is why Google implemented SSTable files in this manner. (SSTable files are functionally
identical to Hadoop MapFiles; see section 4, "Building Blocks", of the BigTable paper: http://labs.google.com/papers/bigtable.html)

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
