hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7722) DataNode#checkDiskError should also remove Storage when error is found.
Date Fri, 06 Mar 2015 23:51:38 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351152#comment-14351152 ]

Colin Patrick McCabe commented on HDFS-7722:

Eddy and I had an offline discussion about the use of {{Set<File>}} here.  It seems
that there is a pervasive assumption elsewhere in the code that FsVolumeSpi instances are
directories.  For example, in these interface methods:

  /** @return the base path to the volume */
  public String getBasePath();

  /** @return the path to the volume */
  public String getPath(String bpid) throws IOException;

  /** @return the directory for the finalized blocks in the block pool. */
  public File getFinalizedDir(String bpid) throws IOException;

So I think using {{Set<File>}} is OK here for now, since it fits in with the rest of
the code.  We will probably have to revisit this later, but it seems outside the scope of
this jira.

One thing I really like about this patch is that we no longer hold the {{FsDatasetImpl}}
mutex while scanning every volume.  This alone is a very important improvement.
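
To illustrate the locking pattern being praised, here is a minimal sketch of one common way to get this effect: copy the volume list under the lock, run the slow disk checks with the lock released, then retake the lock briefly to drop the failed volumes. This is not the code from the patch. The class {{VolumeScanSketch}}, the {{datasetLock}} field, the {{isHealthy}} predicate, and both method names are hypothetical stand-ins.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical sketch; names do not come from FsDatasetImpl. */
public class VolumeScanSketch {
  private final Object datasetLock = new Object();      // stands in for the dataset mutex
  private final List<File> volumes = new ArrayList<>(); // guarded by datasetLock
  private final Predicate<File> isHealthy;               // assumed, possibly slow, disk check

  VolumeScanSketch(List<File> initial, Predicate<File> isHealthy) {
    this.volumes.addAll(initial);
    this.isHealthy = isHealthy;
  }

  /** Returns the volumes that failed the check and removes them from the active list. */
  Set<File> checkDirs() {
    List<File> snapshot;
    synchronized (datasetLock) {
      // Hold the lock only long enough to copy the list.
      snapshot = new ArrayList<>(volumes);
    }

    // The slow part runs without the lock, so other operations are not blocked.
    Set<File> failed = new HashSet<>();
    for (File volume : snapshot) {
      if (!isHealthy.test(volume)) {
        failed.add(volume);
      }
    }

    if (!failed.isEmpty()) {
      synchronized (datasetLock) {
        // Retake the lock briefly to update shared state.
        volumes.removeAll(failed);
      }
    }
    return failed;
  }

  /** Returns a copy of the currently active volumes. */
  List<File> activeVolumes() {
    synchronized (datasetLock) {
      return new ArrayList<>(volumes);
    }
  }
}
```

The trade-off of this approach is that the volume list can change between the snapshot and the removal, so the removal step has to tolerate volumes that are already gone; {{removeAll}} does that here.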

I think it makes sense to leave the failure information around when removing volumes due to
the disk checker. 

    LOG.info("Deactivating volumes: " +
        Joiner.on(",").join(absoluteVolumePaths));

We should print out the value of {{clearFailure}} here.
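
As a rough sketch of what that could look like, the message might carry the flag alongside the paths. The snippet below is an assumption-laden illustration, not the patch itself: it assumes {{clearFailure}} is a boolean in scope, uses {{String.join}} in place of Guava's {{Joiner}} so that it compiles on its own, and wraps the message in a hypothetical helper named {{deactivateMessage}}. The exact wording of the message is also made up.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a log message that includes clearFailure. */
public class DeactivateLogSketch {
  // Builds the message; in the real code this string would be passed to LOG.info.
  static String deactivateMessage(List<String> absoluteVolumePaths, boolean clearFailure) {
    return "Deactivating volumes (clearFailure=" + clearFailure + "): "
        + String.join(",", absoluteVolumePaths);
  }

  public static void main(String[] args) {
    // Example paths are placeholders.
    System.out.println(deactivateMessage(Arrays.asList("/data/1", "/data/2"), false));
  }
}
```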

+1 once that's addressed.

> DataNode#checkDiskError should also remove Storage when error is found.
> -----------------------------------------------------------------------
>                 Key: HDFS-7722
>                 URL: https://issues.apache.org/jira/browse/HDFS-7722
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.6.0
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>         Attachments: HDFS-7722.000.patch, HDFS-7722.001.patch, HDFS-7722.002.patch
> When {{DataNode#checkDiskError}} finds disk errors, it removes all block metadata from
{{FsDatasetImpl}}. However, it does not remove the corresponding {{DataStorage}} and {{BlockPoolSliceStorage}}.

> As a result, we cannot directly run {{reconfig}} to hot swap the failed disks
without changing the configuration file.

This message was sent by Atlassian JIRA