hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7235) Can not decommission DN which has invalid block due to bad disk
Date Wed, 22 Oct 2014 20:21:36 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180466#comment-14180466 ]

Colin Patrick McCabe commented on HDFS-7235:

Hi Yongjun,

Thanks for your patience here.  I don't think the current patch is quite ready.  I could point
to a few things, like this: {{ReplicaInfo replicaInfo = (ReplicaInfo) data.getReplica(}}.
We shouldn't be downcasting here.

I think the bigger issue is that the interface in FsDatasetSpi is just not very suitable to
what we're trying to do.  Rather than trying to hack it, I think we should come up with a
better interface.

I think we should replace {{FsDatasetSpi#isValid}} with this function:

{code}
  /**
   * Check if a block is valid.
   *
   * @param b           The block to check.
   * @param minLength   The minimum length that the block must have.  May be 0.
   * @param state       If this is null, it is ignored.  If it is non-null, we
   *                        will check that the replica has this state.
   *
   * @throws FileNotFoundException             If the replica is not found or there
   *                                              was an error locating it.
   * @throws EOFException                      If the replica length is too short.
   * @throws UnexpectedReplicaStateException   If the replica is not in the
   *                                              expected state.
   */
  public void checkBlock(ExtendedBlock b, long minLength, ReplicaState state);
{code}

Since this function will throw a clearly marked exception detailing which case we're in, we
won't have to call multiple functions.  This will be better for performance since we're only
taking the lock once.  This will also be better for clarity, since the current APIs lead to
some rather complex code.
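To make the idea concrete, here is a minimal, self-contained sketch of the semantics I have in mind.  The {{Replica}} record, the map standing in for {{volumeMap}}, and the block-id parameter are all simplified stand-ins, not the real {{FsDatasetImpl}} types; the point is just the one-call, one-exception-per-failure-mode shape:

```java
import java.io.EOFException;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for the real HDFS types, for illustration only.
public class CheckBlockSketch {
  enum ReplicaState { FINALIZED, RBW, RWR, RUR, TEMPORARY }

  // Stand-in for the proposed UnexpectedReplicaStateException.
  static class UnexpectedReplicaStateException extends IOException {
    UnexpectedReplicaStateException(String msg) { super(msg); }
  }

  // Minimal replica record: just a state and an on-disk length.
  static class Replica {
    final ReplicaState state;
    final long length;
    Replica(ReplicaState state, long length) {
      this.state = state;
      this.length = length;
    }
  }

  // blockId -> replica, standing in for volumeMap.
  static final Map<Long, Replica> volumeMap = new HashMap<>();

  /**
   * Single-lookup validity check.  Each failure mode throws its own
   * clearly marked exception, so the caller makes one call (and the
   * implementation takes the lock once) instead of probing with
   * several boolean methods.
   */
  static void checkBlock(long blockId, long minLength, ReplicaState state)
      throws FileNotFoundException, EOFException,
             UnexpectedReplicaStateException {
    Replica r = volumeMap.get(blockId);
    if (r == null) {
      throw new FileNotFoundException("Replica " + blockId + " not found");
    }
    if (state != null && r.state != state) {
      throw new UnexpectedReplicaStateException("Replica " + blockId
          + " is " + r.state + ", expected " + state);
    }
    if (r.length < minLength) {
      throw new EOFException("Replica " + blockId + " length " + r.length
          + " < required " + minLength);
    }
  }
}
```

The real method would of course take an {{ExtendedBlock}}, hold the dataset lock, and also verify that the block file exists on disk, but the control flow would look like this.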

We could also get rid of {{FsDatasetSpi#isValidRbw}}, since {{checkBlock}} can do everything
that it can.
Also, {{UnexpectedReplicaStateException}} could be a new exception under hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/UnexpectedReplicaStateException.java.
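At the call site, the several boolean probes would collapse into one try/catch.  This is a hypothetical sketch of how a caller might dispatch on the failure modes; the handler strings and the {{handle}} helper are illustrative, not part of any actual patch:

```java
import java.io.EOFException;
import java.io.FileNotFoundException;
import java.io.IOException;

// Illustrative sketch: one try/catch around the proposed checkBlock()
// replaces the old isValid()/isValidRbw() boolean checks.
public class CheckBlockCaller {

  // Stand-in for the proposed exception type.
  static class UnexpectedReplicaStateException extends IOException {
    UnexpectedReplicaStateException(String msg) { super(msg); }
  }

  // Map each checkBlock() failure mode to the action the caller would take.
  static String handle(IOException e) {
    if (e instanceof FileNotFoundException) {
      // Missing block file (e.g. bad disk): the case that should be
      // reported back to the NN rather than failing silently.
      return "report corrupt replica to NN";
    } else if (e instanceof EOFException) {
      return "replica too short";
    } else if (e instanceof UnexpectedReplicaStateException) {
      return "replica in unexpected state";
    }
    return "rethrow";
  }
}
```

Each branch is unambiguous, which is exactly what the current boolean-returning APIs can't give us.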

I think it's fine to change FsDatasetSpi for this (we did it when adding caching stuff, and
again when adding "trash").

Let me know what you think.  I think it would make things a lot more clear.

> Can not decommission DN which has invalid block due to bad disk
> ---------------------------------------------------------------
>                 Key: HDFS-7235
>                 URL: https://issues.apache.org/jira/browse/HDFS-7235
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, namenode
>    Affects Versions: 2.6.0
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-7235.001.patch, HDFS-7235.002.patch, HDFS-7235.003.patch
> When decommissioning a DN, the process hangs.
> What happens is: when the NN chooses a replica as a source from which to replicate data on the
> to-be-decommissioned DN to other DNs, it favors choosing the to-be-decommissioned DN itself as
> the source of the transfer (see BlockManager.java).  However, because of the bad disk, the DN
> detects the source block to be transferred as an invalid block, via the following logic in
> FsDatasetImpl.java:
> {code}
>   /** Does the block exist and have the given state? */
>   private boolean isValid(final ExtendedBlock b, final ReplicaState state) {
>     final ReplicaInfo replicaInfo = volumeMap.get(b.getBlockPoolId(), 
>         b.getLocalBlock());
>     return replicaInfo != null
>         && replicaInfo.getState() == state
>         && replicaInfo.getBlockFile().exists();
>   }
> {code}
> The reason that this method returns false (detecting an invalid block) is that the block
> file doesn't exist, due to the bad disk in this case.
> The key issue we found here is: after the DN detects an invalid block for the above reason,
> it doesn't report the invalid block back to the NN, so the NN doesn't know that the block is
> corrupted, and keeps sending the data transfer request to the same to-be-decommissioned DN,
> again and again.  This causes an infinite loop, so the decommission process hangs.
> Thanks [~qwertymaniac] for reporting the issue and initial analysis.

This message was sent by Atlassian JIRA
