Mailing-List: contact hadoop-dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hadoop-dev@lucene.apache.org
Message-ID: <2138966170.1143318079380.JavaMail.jira@ajax>
Date: Sat, 25 Mar 2006 20:21:19 +0000 (GMT)
From: "Andrzej Bialecki  (JIRA)" <jira@apache.org>
To: hadoop-dev@lucene.apache.org
Subject: [jira] Created: (HADOOP-106) Data blocks should be record-oriented.
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

Data blocks should be record-oriented.
--------------------------------------

         Key: HADOOP-106
         URL: http://issues.apache.org/jira/browse/HADOOP-106
     Project: Hadoop
        Type: Wish
  Components: dfs  
    Versions: 0.2    
    Reporter: Andrzej Bialecki 


If data blocks started and ended on record boundaries, rather than at arbitrary offsets within a file, there would be some important advantages:

* it would no longer be necessary to "fish" for the beginning of the first record in a split (see SequenceFile.Reader.sync()).

* recovering from DFS errors would be easier and succeed more often: in most cases missing blocks could simply be skipped and the remaining parts combined.
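
The two points above can be illustrated with a small sketch. This is not the DFS block format or any Hadoop API: the length-prefixed record encoding, the tiny block size, and the helper names (encode, fixed_blocks, record_aligned_blocks, decode_block) are all assumptions made up for illustration. It only shows that when blocks are cut on record boundaries, each block decodes on its own and a missing block can be skipped.

```python
import struct

def encode(record: bytes) -> bytes:
    """Hypothetical record format: 4-byte big-endian length, then the payload."""
    return struct.pack(">I", len(record)) + record

def fixed_blocks(records, block_size):
    """Cut the byte stream every block_size bytes, ignoring record boundaries."""
    stream = b"".join(encode(r) for r in records)
    return [stream[i:i + block_size] for i in range(0, len(stream), block_size)]

def record_aligned_blocks(records, block_size):
    """Close a block early rather than let a record straddle two blocks."""
    blocks, current = [], b""
    for r in records:
        item = encode(r)
        if current and len(current) + len(item) > block_size:
            blocks.append(current)
            current = b""
        current += item  # a record larger than block_size gets a block of its own
    if current:
        blocks.append(current)
    return blocks

def decode_block(block):
    """Decode one block, assuming it starts and ends on a record boundary."""
    out, pos = [], 0
    while pos < len(block):
        (n,) = struct.unpack_from(">I", block, pos)
        out.append(block[pos + 4:pos + 4 + n])
        pos += 4 + n
    return out

records = [b"alpha", b"bravo-bravo", b"charlie", b"delta", b"echo-echo"]

# Fixed-size cutting: the last record's length prefix and payload end up
# in different blocks, so a reader of the final block must search for a
# record start.
fixed = fixed_blocks(records, block_size=24)

# Record-aligned cutting: every block is independently decodable.
aligned = record_aligned_blocks(records, block_size=24)

# Simulate losing the second block: skip it and combine what remains.
survivors = [r for i, b in enumerate(aligned) if i != 1 for r in decode_block(b)]
```

In this sketch the price of alignment is that blocks are no longer all the same size, since a block is closed as soon as the next record would not fit.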

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira