hadoop-common-issues mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-6307) Support reading on un-closed SequenceFile
Date Tue, 01 Dec 2009 22:27:20 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-6307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-6307:

    Status: Open  (was: Patch Available)

Only nit:

+    public Reader(Path file, FSDataInputStream in, int buffersize,
+        long start, long length, Configuration conf) throws IOException

should be

+    public Reader(FSDataInputStream in, int buffersize,
+        long start, long length, Configuration conf) throws IOException

Having the 'file' parameter there is useless, since we do not use it in the constructor.
People might get confused about its usage or, worse, assume that we will open the file again.
The proposed alternative will push people to think in the right direction, i.e. they open
the file and hand us the input stream along with the start/length.
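
To make the intended calling convention concrete, here is a minimal sketch in plain Java.
It is not Hadoop code: RangeReaderSketch and RangeReader are hypothetical names, and
java.io.InputStream stands in for FSDataInputStream. The only point it illustrates is that
a reader with no path argument cannot reopen the file or look up its length; it is bounded
purely by the start/length the caller supplies.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names exist in Hadoop.
final class RangeReaderSketch {

    /** Reads the bytes [start, start + length) of a stream the caller already opened. */
    static final class RangeReader {
        private final InputStream in;
        private long remaining;

        // No path parameter: the reader works only with what it is handed.
        RangeReader(InputStream in, long start, long length) throws IOException {
            if (start < 0 || length < 0) {
                throw new IllegalArgumentException("start and length must be non-negative");
            }
            this.in = in;
            // Consume and discard the bytes before the requested range.
            for (long i = 0; i < start; i++) {
                if (in.read() < 0) {
                    throw new EOFException("stream ended before start offset");
                }
            }
            this.remaining = length;
        }

        /** Returns the next byte of the range, or -1 once length bytes have been read. */
        int read() throws IOException {
            if (remaining == 0) {
                return -1;
            }
            int b = in.read();
            if (b < 0) {
                throw new EOFException("stream ended before the caller-supplied length");
            }
            remaining--;
            return b;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "headerPAYLOADtrailer".getBytes(StandardCharsets.US_ASCII);
        // The caller opens the stream and decides which range is readable.
        try (InputStream in = new ByteArrayInputStream(data)) {
            RangeReader reader = new RangeReader(in, 6, 7);
            StringBuilder sb = new StringBuilder();
            for (int b; (b = reader.read()) != -1; ) {
                sb.append((char) b);
            }
            System.out.println(sb); // prints "PAYLOAD"
        }
    }
}
```

In this shape the caller owns both the stream and the decision about how many bytes are
safe to read, which is the responsibility split the shorter constructor asks for.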

> Support reading on un-closed SequenceFile
> -----------------------------------------
>                 Key: HADOOP-6307
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6307
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: io
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>         Attachments: c6307_20091124.patch, c6307_20091130.patch
> When a SequenceFile.Reader is constructed, it calls fs.getFileStatus(file).getLen().
> However, fs.getFileStatus(file).getLen() does not return the hflushed length for an
> un-closed file, since the Namenode does not know the hflushed length. DFSClient has to ask
> a datanode for the length of the last block, which is still being written; see also HDFS-570.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
