hadoop-hdfs-issues mailing list archives

From "Uma Maheswara Rao G (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3222) DFSInputStream#openInfo should not silently get the length as 0 when locations length is zero for last partial block.
Date Sat, 07 Apr 2012 02:09:17 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249132#comment-13249132 ]

Uma Maheswara Rao G commented on HDFS-3222:

Yes, I think that should solve the problem.

Since we persist the number of bytes for each block, we can use the persisted size of that partial block.
Let me write some tests to replicate this.

public static void writeCompactBlockArray(
      Block[] blocks, DataOutputStream out) throws IOException {
    WritableUtils.writeVInt(out, blocks.length);
    Block prev = null;
    for (Block b : blocks) {
      long szDelta = b.getNumBytes() -
          (prev != null ? prev.getNumBytes() : 0);
      long gsDelta = b.getGenerationStamp() -
          (prev != null ? prev.getGenerationStamp() : 0);
      out.writeLong(b.getBlockId()); // blockid is random
      WritableUtils.writeVLong(out, szDelta);
      WritableUtils.writeVLong(out, gsDelta);
      prev = b;
    }
}

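To illustrate why the persisted data is enough to recover the size of a partial block, here is a minimal round-trip sketch of the same delta scheme. It is not the Hadoop implementation: the class CompactBlockSketch and its nested Blk type are hypothetical stand-ins, and plain writeInt/writeLong calls replace the variable-length WritableUtils encoding so that the example needs only the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch; names are illustrative and not part of Hadoop.
final class CompactBlockSketch {
    // Minimal stand-in for a block: id, byte count, generation stamp.
    static final class Blk {
        final long id;
        final long numBytes;
        final long genStamp;

        Blk(long id, long numBytes, long genStamp) {
            this.id = id;
            this.numBytes = numBytes;
            this.genStamp = genStamp;
        }
    }

    // Writes the block count, then per block: id, size delta, stamp delta.
    static byte[] write(Blk[] blocks) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(blocks.length);
            Blk prev = null;
            for (Blk b : blocks) {
                out.writeLong(b.id);
                out.writeLong(b.numBytes - (prev != null ? prev.numBytes : 0));
                out.writeLong(b.genStamp - (prev != null ? prev.genStamp : 0));
                prev = b;
            }
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reverses write(): adds each delta back onto the previous block's value.
    static Blk[] read(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            Blk[] blocks = new Blk[in.readInt()];
            Blk prev = null;
            for (int i = 0; i < blocks.length; i++) {
                long id = in.readLong();
                long numBytes = in.readLong() + (prev != null ? prev.numBytes : 0);
                long genStamp = in.readLong() + (prev != null ? prev.genStamp : 0);
                blocks[i] = new Blk(id, numBytes, genStamp);
                prev = blocks[i];
            }
            return blocks;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A trailing half-full block comes back from read() with its own numBytes, even though only the difference from the preceding full block was stored.
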
There are two cases here:

1) Locations are found, but we are not able to connect to any of them. Then replicaNotFoundCount
is decremented on each attempt, and if replicaNotFoundCount == 0, it returns 0.
2) There are no locations for that block. Then replicaNotFoundCount is obviously 0, and the
length is returned as 0.

I think in both cases we can fall back to this persisted block size.
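A rough sketch of that fallback, under stated assumptions: this is not the DFSInputStream code or the eventual patch. The names LastBlockLengthSketch, ReplicaProbe and lastBlockLength are hypothetical, a local ReplicaNotFoundException stands in for the DataNode reporting that it does not know the replica, and persistedNumBytes stands for the block size the NameNode has persisted.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical sketch; names are illustrative and not part of Hadoop.
final class LastBlockLengthSketch {
    // Stand-in for "this DataNode does not know the replica".
    static final class ReplicaNotFoundException extends IOException {
        ReplicaNotFoundException(String message) {
            super(message);
        }
    }

    // Stand-in for asking one DataNode for the replica's visible length.
    interface ReplicaProbe {
        long visibleLength() throws IOException;
    }

    static long lastBlockLength(List<ReplicaProbe> locations, long persistedNumBytes) {
        int replicaNotFoundCount = locations.size();
        for (ReplicaProbe probe : locations) {
            try {
                return probe.visibleLength(); // first DataNode that answers wins
            } catch (ReplicaNotFoundException e) {
                replicaNotFoundCount--; // this location does not have the replica
            } catch (IOException e) {
                // Some other failure: keep the count so the error is not masked.
            }
        }
        // Reached with zero locations (case 2) or when every location reported
        // that the replica is missing (case 1). Instead of silently returning 0,
        // fall back to the persisted size.
        if (replicaNotFoundCount == 0) {
            return persistedNumBytes;
        }
        throw new IllegalStateException("Cannot obtain block length from any location");
    }
}
```

With an empty location list the loop never runs and the persisted size is returned, so a reader would no longer place its end marker at the last complete block.
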

> DFSInputStream#openInfo should not silently get the length as 0 when locations length is zero for last partial block.
> ---------------------------------------------------------------------------------------------------------------------
>                 Key: HDFS-3222
>                 URL: https://issues.apache.org/jira/browse/HDFS-3222
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 1.0.3, 2.0.0, 3.0.0
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
> I have seen one situation with an HBase cluster.
> The scenario is as follows:
> 1) 1.5 blocks have been written and synced.
> 2) The cluster is suddenly restarted.
> A reader opens the file and tries to get the length. At this point the DNs holding the partial block have not yet reported to the NN, so the locations for this partial block are empty. In this case, DFSInputStream assumes that 1 block size is the final size.
> The reader in turn assumes that 1 block size is the final length and sets its end marker there. The reader ends up reading only part of the data. Because of this, HMaster could not replay the complete edits.
> This actually happened with the 20 version. Looking at the code, the same problem should be present in trunk as well.
> {code}
>     int replicaNotFoundCount = locatedblock.getLocations().length;
>     for(DatanodeInfo datanode : locatedblock.getLocations()) {
> ..........
> ..........
>  // Namenode told us about these locations, but none know about the replica
>     // means that we hit the race between pipeline creation start and end.
>     // we require all 3 because some other exception could have happened
>     // on a DN that has it.  we want to report that error
>     if (replicaNotFoundCount == 0) {
>       return 0;
>     }
> {code}
