hadoop-hdfs-dev mailing list archives

From "Denny Ye (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-3152) Reading consistency for all readers
Date Mon, 26 Mar 2012 08:22:27 GMT
Reading consistency for all readers 

                 Key: HDFS-3152
                 URL: https://issues.apache.org/jira/browse/HDFS-3152
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs client
    Affects Versions: 0.21.0, 0.20.2
            Reporter: Denny Ye

I hit an exception when trying to seek to the latest size of a file that another client was
still writing. The message is "Cannot seek after EOF". I took the seek target from a previous
input stream and was trying to read the file incrementally. This means the target was greater
than the file size.
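
The failure can be reproduced in a toy model. This is only a sketch: LengthBoundReader is a
hypothetical class, not the real DFSInputStream, and the lengths 150 and 120 are made-up
values standing in for what two streams might learn when they are opened.

```java
import java.io.IOException;

// Hypothetical model of a reader; not the real DFSInputStream.
final class LengthBoundReader {
    // Length this stream learned when it was opened (assumed fixed afterwards).
    private final long visibleLength;
    private long position;

    LengthBoundReader(long visibleLength) {
        this.visibleLength = visibleLength;
    }

    long getVisibleLength() {
        return visibleLength;
    }

    long getPosition() {
        return position;
    }

    // Rejects any target beyond the length this stream knows about.
    void seek(long target) throws IOException {
        if (target < 0) {
            throw new IOException("Cannot seek to negative offset");
        }
        if (target > visibleLength) {
            throw new IOException("Cannot seek after EOF");
        }
        position = target;
    }

    public static void main(String[] args) {
        LengthBoundReader previous = new LengthBoundReader(150L); // saw 150 bytes
        LengthBoundReader current = new LengthBoundReader(120L);  // sees only 120 bytes
        long target = previous.getVisibleLength();
        try {
            current.seek(target);
            System.out.println("seek ok");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the second stream fails because its length is smaller than one that an earlier
stream already reported, which is the inconsistency described in this issue.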

In my opinion, the confirmed visible file length is the total size of the completed blocks
(known to the NameNode) plus the size reported by the last DataNode in the pipeline for the last block.
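
As a rough illustration of that definition, here is a minimal sketch. The class and method
names are hypothetical and are not part of the HDFS API; the block sizes are made-up numbers.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of the HDFS API.
final class VisibleLengthSketch {
    /**
     * @param completedBlockSizes sizes in bytes of the blocks the NameNode reports as complete
     * @param lastBlockBytesAtPipelineTail bytes of the last block reported by the last
     *                                     DataNode in the write pipeline
     */
    static long confirmedVisibleLength(List<Long> completedBlockSizes,
                                       long lastBlockBytesAtPipelineTail) {
        if (lastBlockBytesAtPipelineTail < 0) {
            throw new IllegalArgumentException("negative byte count");
        }
        long total = 0;
        for (long size : completedBlockSizes) {
            total += size;
        }
        return total + lastBlockBytesAtPipelineTail;
    }

    public static void main(String[] args) {
        // Two completed blocks of 128 bytes plus 40 bytes in the block under construction.
        long length = confirmedVisibleLength(Arrays.asList(128L, 128L), 40L);
        System.out.println(length); // prints 296
    }
}
```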

There are two separate questions here: 1. How to obtain a confirmed visible file length that
is the same for all readers. 2. For each reader, how to pick the best DN for a given block.

Actually, the existing code mixes up these two parts. The NameNode sorts block locations to
favor local reads (HBase or local MapReduce; a random DataNode for an outside reader). DFSClient
takes the first DataNode of the last block. Pay attention to this point! The client may obtain
a 'dirty' file length from the first DN of the last block that the NameNode returned. And the
client always uses the first DN for each block to read file content.
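
One way to picture the split is the sketch below. It is a toy model under stated assumptions:
the Replica type, its fields, and the distance values are hypothetical, and it assumes a DN
earlier in the pipeline may hold more bytes of the last block than the pipeline tail does.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model; none of these types exist in HDFS.
final class ReplicaChoiceSketch {
    static final class Replica {
        final String name;
        final int pipelineIndex;   // 0 = first DN in the write pipeline
        final int distance;        // assumed network distance from the reader, 0 = local
        final long bytesReported;  // bytes of the last block this DN reports

        Replica(String name, int pipelineIndex, int distance, long bytesReported) {
            this.name = name;
            this.pipelineIndex = pipelineIndex;
            this.distance = distance;
            this.bytesReported = bytesReported;
        }
    }

    // Question 1: every reader asks the pipeline tail, so all readers agree on the length.
    static Replica lengthSource(List<Replica> replicas) {
        if (replicas.isEmpty()) {
            throw new IllegalArgumentException("no replicas");
        }
        Replica tail = replicas.get(0);
        for (Replica r : replicas) {
            if (r.pipelineIndex > tail.pipelineIndex) {
                tail = r;
            }
        }
        return tail;
    }

    // Question 2: each reader picks the closest DN to read the bytes from.
    static Replica readSource(List<Replica> replicas) {
        if (replicas.isEmpty()) {
            throw new IllegalArgumentException("no replicas");
        }
        Replica closest = replicas.get(0);
        for (Replica r : replicas) {
            if (r.distance < closest.distance) {
                closest = r;
            }
        }
        return closest;
    }

    public static void main(String[] args) {
        List<Replica> replicas = Arrays.asList(
            new Replica("dn1", 0, 0, 100L),  // local to this reader, most bytes
            new Replica("dn2", 1, 2, 80L),
            new Replica("dn3", 2, 4, 60L));  // pipeline tail
        System.out.println("length from " + lengthSource(replicas).name);
        System.out.println("data from " + readSource(replicas).name);
    }
}
```

With these sample values the length comes from dn3 and the data from dn1. If the reader had
taken the length from dn1 as well, it would have seen 100 bytes rather than the 60 that the
pipeline tail reports.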

Should we split these two cases?


