hadoop-hdfs-issues mailing list archives

From "Uma Maheswara Rao G (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3222) DFSInputStream#openInfo should not silently get the length as 0 when locations length is zero for last partial block.
Date Mon, 09 Apr 2012 17:35:17 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249990#comment-13249990 ]

Uma Maheswara Rao G commented on HDFS-3222:
-------------------------------------------

{quote}
On the first hflush() for a block, it calls NN.fsync(), which internally calls persistBlocks().
Currently, the fsync call doesn't give a length, but perhaps it could?
{quote}
My point is that even though the client has flushed the data, the DNs will not have reported
to the NN yet, right? Did you check the test above?
I changed the code as per our proposal and debugged it as well. It was still persisting the
length as 0.


{quote}
The other thought is that, after a restart, a block that was previously being written would
be in the under construction state, but with no expectedTargets. This differs from the case
where a block has been allocated but not yet written to replicas. We could use that to set
a new flag in the LocatedBlock response indicating that it's not a 0-length,
{quote}
I was thinking along the same lines :-). I got your point, and I feel this would be possible to do.
But my actual question is: after we distinguish the two cases, what can we do from the client?
Do you mean we would retry until we get the locations? If yes, there would be another problem,
because when
1) the client wants to read some partial data that exists in the first block itself,
2) open may try to get the complete length, and that will block if we retry until the DNs report
to the NN,
3) but the DNs may really be down for a long time.
In that case, we cannot read even up to the specified length, which is less than the start offset
of the partial block.

Your suggestion might be something else here. What are your thoughts?
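
To make the concern above concrete, here is a minimal Java sketch of one possible client-side behaviour: report the last block's length as "unknown" instead of 0 when there are no locations, and let reads that fall entirely within the completed blocks proceed without waiting. This is only an illustration under assumptions; the class, the methods and the UNKNOWN sentinel are hypothetical names and not part of the HDFS client API, and it is not what either commenter proposed in detail.

```java
/** Hypothetical sketch; none of these names are HDFS APIs. */
final class LastBlockLengthSketch {

    /** Assumed sentinel meaning "length of the last partial block is not known yet". */
    static final long UNKNOWN = -1L;

    /**
     * If the NameNode returned no locations for the last under-construction
     * block, report UNKNOWN rather than silently treating the block as empty.
     */
    static long lastBlockLength(int locationCount, long lengthReportedByDataNode) {
        if (locationCount == 0) {
            return UNKNOWN;
        }
        return lengthReportedByDataNode;
    }

    /**
     * A read of [offset, offset + len) can be served without the last block's
     * length when it lies entirely inside the already completed blocks.
     */
    static boolean canServeWithoutLastBlock(long completeBlocksLength, long offset, long len) {
        return offset >= 0
            && len >= 0
            && offset <= completeBlocksLength
            && len <= completeBlocksLength - offset; // written this way to avoid overflow
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024; // assumed block size, for illustration only
        System.out.println(lastBlockLength(0, 0L));
        System.out.println(canServeWithoutLastBlock(blockSize, 0L, 4096L));
        System.out.println(canServeWithoutLastBlock(blockSize, blockSize, 1L));
    }
}
```

With this shape, open would not need to block on DN reports; only a read that reaches into the partial block would have to retry or fail.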
                
> DFSInputStream#openInfo should not silently get the length as 0 when locations length
is zero for last partial block.
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3222
>                 URL: https://issues.apache.org/jira/browse/HDFS-3222
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 1.0.3, 2.0.0, 3.0.0
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>         Attachments: HDFS-3222-Test.patch
>
>
> I have seen one situation with an HBase cluster.
> The scenario is as follows:
> 1) 1.5 blocks have been written and synced.
> 2) The cluster is suddenly restarted.
> A reader opened the file and tried to get the length. By this time, the DNs containing the partial
block had not yet reported to the NN, so the locations for this partial block would be 0. In this
case, DFSInputStream assumes that 1 block size is the final size.
> The reader then also assumes that 1 block size is the final length and sets its end marker there.
The reader finally ends up reading only partial data. Due to this, HMaster could not replay
the complete edits.
> Actually, this happened with the 20 version. Looking at the code, the same should be present in trunk
as well.
> {code}
>     int replicaNotFoundCount = locatedblock.getLocations().length;
>     
>     for(DatanodeInfo datanode : locatedblock.getLocations()) {
> ..........
> ..........
>     // Namenode told us about these locations, but none know about the replica
>     // means that we hit the race between pipeline creation start and end.
>     // we require all 3 because some other exception could have happened
>     // on a DN that has it.  we want to report that error
>     if (replicaNotFoundCount == 0) {
>       return 0;
>     }
> {code}
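
The effect described in the issue can be worked through with a small Java model. This is a simplified sketch, not the DFSInputStream code: the class and method names are hypothetical, and the 64 MB block size is an assumed value chosen only to make the arithmetic visible. It mirrors the quoted early return, where an empty locations array makes the last block contribute 0 bytes.

```java
/** Hypothetical model of the length computation; not the DFSInputStream code. */
final class VisibleLengthSketch {

    /**
     * Total length the reader sees: completed blocks plus whatever is
     * reported for the last partial block.
     */
    static long visibleLength(long completeBlocksLength,
                              int lastBlockLocations,
                              long lastBlockReplicaLength) {
        // Mirrors the quoted check: with no locations the count starts at 0,
        // the loop never runs, and the last block's length comes back as 0.
        long lastBlockLength = (lastBlockLocations == 0) ? 0L : lastBlockReplicaLength;
        return completeBlocksLength + lastBlockLength;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024; // assumed block size
        long halfBlock = blockSize / 2;     // the synced partial block from the scenario

        // After the restart, before the DNs report: only 1 block is visible.
        System.out.println(visibleLength(blockSize, 0, halfBlock));
        // Once locations are known: the full 1.5 blocks are visible.
        System.out.println(visibleLength(blockSize, 3, halfBlock));
    }
}
```

In the first case the reader would set its end marker at one block size and never see the synced half block, which matches the truncated edits replay described above.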


        
