hadoop-common-dev mailing list archives

From "Andrzej Bialecki (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-106) Data blocks should be record-oriented.
Date Sat, 25 Mar 2006 20:21:19 GMT
Data blocks should be record-oriented.
--------------------------------------

         Key: HADOOP-106
         URL: http://issues.apache.org/jira/browse/HADOOP-106
     Project: Hadoop
        Type: Wish
  Components: dfs  
    Versions: 0.2    
    Reporter: Andrzej Bialecki 


If data blocks started and ended on data record boundaries, rather than at arbitrary
offsets within a file, it would give some important advantages:

* it would be possible to avoid "fishing" for the beginning of the first record in a split
(see SequenceFile.Reader.sync()).

* it would make recovering from DFS errors easier and much more likely to succeed - in most
cases missing blocks could simply be skipped and the remaining parts combined together.
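
The two points above can be illustrated with a toy model. This is only a sketch under
stated assumptions, not Hadoop's actual on-disk format or DFS code: the length-prefixed
record encoding, the 8-byte SYNC marker, and every function name below are hypothetical
and chosen for illustration.

```python
import struct

# Hypothetical 8-byte sync marker; NOT the marker SequenceFile really uses.
SYNC = b"\xff" * 4 + b"SYNC"


def encode(record: bytes) -> bytes:
    """Toy encoding: 4-byte big-endian length followed by the payload."""
    return struct.pack(">I", len(record)) + record


def split_byte_oriented(records, block_size, sync_every=2):
    """Cut the stream every block_size bytes, ignoring record boundaries.
    A sync marker is written before every sync_every-th record."""
    stream = b""
    for i, rec in enumerate(records):
        if i % sync_every == 0:
            stream += SYNC
        stream += encode(rec)
    return [stream[i:i + block_size] for i in range(0, len(stream), block_size)]


def fish_for_sync(block: bytes) -> int:
    """Offset of the first sync marker in a block, or -1 if there is none.
    This scan is the 'fishing' a reader must do in a byte-oriented block."""
    return block.find(SYNC)


def split_record_oriented(records, block_size):
    """Start a new block instead of splitting a record across two blocks."""
    blocks, current = [], b""
    for rec in records:
        enc = encode(rec)
        if current and len(current) + len(enc) > block_size:
            blocks.append(current)
            current = b""
        current += enc
    if current:
        blocks.append(current)
    return blocks


def read_block(block: bytes):
    """Decode all records of a block that starts on a record boundary."""
    out, pos = [], 0
    while pos < len(block):
        (n,) = struct.unpack_from(">I", block, pos)
        out.append(block[pos + 4:pos + 4 + n])
        pos += 4 + n
    return out


def recover(blocks, missing):
    """Skip the missing block indices and combine the records of the rest."""
    out = []
    for i, block in enumerate(blocks):
        if i not in missing:
            out.extend(read_block(block))
    return out


records = [b"alpha", b"bravo", b"charlie", b"delta", b"echo"]
byte_blocks = split_byte_oriented(records, block_size=20)
record_blocks = split_record_oriented(records, block_size=20)
```

In this toy model the second byte-oriented block begins in the middle of a record, so a
reader has to scan forward to the next marker before it can decode anything. Each
record-oriented block can be decoded on its own, so recover(record_blocks, {1}) returns
the records of the surviving blocks without any scanning.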

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

