Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hdfs-issues@hadoop.apache.org
Date: Mon, 22 Sep 2014 18:44:34 +0000 (UTC)
From: "Colin Patrick McCabe (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Message-ID: <JIRA.12735207.1408488007000.96284.1411411474966@Atlassian.JIRA>
In-Reply-To: <JIRA.12735207.1408488007000@Atlassian.JIRA>
References: <JIRA.12735207.1408488007000@Atlassian.JIRA>
 <JIRA.12735207.1408488007198@arcas>
Subject: [jira] [Commented] (HDFS-6877) Avoid calling checkDisk when an HDFS
 volume is removed during a write.
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HDFS-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143608#comment-14143608 ] 

Colin Patrick McCabe commented on HDFS-6877:
--------------------------------------------

[~eddyxu]: I renamed this JIRA based on our offline discussion.  It seems that the main advantage here is that we can avoid the checkDisk operation that would otherwise be triggered.

We also talked about some alternative approaches, such as reference-counting the HDFS volumes and not removing a volume until all references to it were gone.  But we decided that these approaches were not worth the complexity.

{code}
  /**
   * Finalizes the block previously opened for writing using writeToBlock.
   * The block size is what is in the parameter b and it must match the amount
   *  of data written
   * @throws IOException
   */
  public void finalizeBlock(ExtendedBlock b) throws IOException;
{code}
Can we add a {{@throws}} line here explaining that this function throws {{ReplicaNotFoundException}} when the block being finalized resides on an HDFS volume that has been removed, or when the replica otherwise cannot be found?  All implementations of {{FsDatasetSpi}} should honor this contract.
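To make the suggested contract concrete, here is a minimal, self-contained sketch rather than the actual patch.  {{SketchDataset}}, {{InMemoryDataset}}, {{WriterSketch}} and the returned strings are hypothetical names invented for illustration; the local {{ReplicaNotFoundException}} is a stand-in so the example compiles without Hadoop, and block IDs stand in for {{ExtendedBlock}}.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Stand-in for Hadoop's ReplicaNotFoundException so the sketch compiles alone.
class ReplicaNotFoundException extends IOException {
  ReplicaNotFoundException(String msg) { super(msg); }
}

// Hypothetical, trimmed-down dataset interface (not the real Hadoop one).
interface SketchDataset {
  /**
   * Finalizes the block previously opened for writing using writeToBlock.
   * The block size must match the amount of data written.
   * @throws ReplicaNotFoundException if the replica cannot be found, for
   *         example because its volume was removed during the write
   * @throws IOException on other I/O errors
   */
  void finalizeBlock(long blockId) throws IOException;
}

// Toy implementation: tracks which volume each in-progress block lives on.
class InMemoryDataset implements SketchDataset {
  private final Map<Long, String> blockToVolume = new HashMap<>();
  private final Set<Long> finalized = new HashSet<>();

  void writeToBlock(long blockId, String volume) {
    blockToVolume.put(blockId, volume);
  }

  // Removing a volume drops every replica that was stored on it.
  void removeVolume(String volume) {
    blockToVolume.values().removeIf(volume::equals);
  }

  @Override
  public void finalizeBlock(long blockId) throws IOException {
    if (!blockToVolume.containsKey(blockId)) {
      throw new ReplicaNotFoundException("Replica not found for block " + blockId);
    }
    finalized.add(blockId);
  }

  boolean isFinalized(long blockId) {
    return finalized.contains(blockId);
  }
}

// Hypothetical caller: only non-replica I/O failures lead to a disk check.
class WriterSketch {
  static String tryFinalize(SketchDataset ds, long blockId) {
    try {
      ds.finalizeBlock(blockId);
      return "finalized";
    } catch (ReplicaNotFoundException e) {
      // The replica is gone (e.g. its volume was removed): skip the disk check.
      return "replica-not-found";
    } catch (IOException e) {
      // Any other I/O failure may still warrant a disk check.
      return "check-disk";
    }
  }

  public static void main(String[] args) {
    InMemoryDataset ds = new InMemoryDataset();
    ds.writeToBlock(1L, "/data/1");
    ds.writeToBlock(2L, "/data/2");
    ds.removeVolume("/data/2");
    System.out.println(tryFinalize(ds, 1L));
    System.out.println(tryFinalize(ds, 2L));
  }
}
```

With the exception documented on the interface, a caller can catch it separately from other I/O errors, as {{tryFinalize}} does here, instead of treating every failure as a possible disk problem.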

{code}
diff --git hadoop-hdfs-project/hadoop-hdfs/src/main/proto/datatransfer.proto hadoop-hdfs-project/hadoop-hdfs/src/main/proto/datatransfer.proto
index 098d10a..56d8148 100644
--- hadoop-hdfs-project/hadoop-hdfs/src/main/proto/datatransfer.proto
+++ hadoop-hdfs-project/hadoop-hdfs/src/main/proto/datatransfer.proto
@@ -212,7 +212,7 @@ enum Status {
   CHECKSUM_OK = 6;
   ERROR_UNSUPPORTED = 7;
   OOB_RESTART = 8;            // Quick restart
-  OOB_RESERVED1 = 9;          // Reserved
+  OOB_INTERRUPTED = 9;        // Interrupted
{code}
Do we still need a special Status code, or can we use the {{ERROR}} status code plus a custom message?

> Avoid calling checkDisk when an HDFS volume is removed during a write.
> ----------------------------------------------------------------------
>
>                 Key: HDFS-6877
>                 URL: https://issues.apache.org/jira/browse/HDFS-6877
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>    Affects Versions: 2.5.0
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>         Attachments: HDFS-6877.000.consolidate.txt, HDFS-6877.000.delta-HDFS-6727.txt, HDFS-6877.001.combo.txt, HDFS-6877.001.patch, HDFS-6877.002.patch, HDFS-6877.003.patch
>
>
> Avoid calling checkDisk when an HDFS volume is removed during a write.

