hadoop-common-issues mailing list archives

From "Tsz Wo (Nicholas), SZE (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6307) Support reading on un-closed SequenceFile
Date Mon, 12 Oct 2009 17:32:31 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764758#action_12764758 ]

Tsz Wo (Nicholas), SZE commented on HADOOP-6307:

> Not sure why this issue only hits SequenceFile. The problem applies equally to TFile
(although this was pushed to the caller).

This problem applies to any implementation that gets the length of an un-closed file by
calling fs.getFileStatus(file).getLen().  (By "problem", I mean that the reader may not see
all hflushed bytes; it sees only part of the file.  This is the same behavior as before
append.)  I had not checked TFile before.  TFile does not have this problem if the caller
manages to get the correct length and passes it to the TFile.Reader constructor.
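The failure mode can be illustrated outside HDFS with plain java.io: a reader that bounds
itself by a length snapshot taken before the writer's latest flush silently misses the
flushed tail. This is a simplified analogy, not Hadoop code; the class and method names are
illustrative only:

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class StaleLengthDemo {
    // Returns what a reader sees when it caps itself at a length snapshot
    // taken before the writer's second flush (analogous to a SequenceFile.Reader
    // trusting fs.getFileStatus(file).getLen() on an un-closed file).
    public static String readWithStaleLength() throws IOException {
        Path file = Files.createTempFile("unclosed", ".dat");
        try (OutputStream out = Files.newOutputStream(file)) {
            out.write("first-batch".getBytes());
            out.flush();

            // Length snapshot: does not include bytes flushed after this point.
            long staleLen = Files.size(file);

            // The writer keeps going; the file remains open.
            out.write(",second-batch".getBytes());
            out.flush();

            // A reader bounded by the stale length misses ",second-batch".
            byte[] seen = new byte[(int) staleLen];
            try (DataInputStream in =
                     new DataInputStream(Files.newInputStream(file))) {
                in.readFully(seen);
            }
            return new String(seen);
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readWithStaleLength()); // prints "first-batch"
    }
}
```

In HDFS the snapshot is worse than stale: the Namenode does not know the hflushed length of
the block being written at all, so the correct length must come from a datanode (HDFS-570).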

> Support reading on un-closed SequenceFile
> -----------------------------------------
>                 Key: HADOOP-6307
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6307
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: io
>            Reporter: Tsz Wo (Nicholas), SZE
> When a SequenceFile.Reader is constructed, it calls fs.getFileStatus(file).getLen().
> However, fs.getFileStatus(file).getLen() does not return the hflushed length for an
> un-closed file, since the Namenode does not know the hflushed length.  The DFSClient
> has to ask a datanode for the length of the last block, which is still being written;
> see also HDFS-570.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
