hadoop-hdfs-issues mailing list archives

From "Zhe Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6867) For DFSOutputStream, do pipeline recovery for a single block in the background
Date Thu, 11 Sep 2014 16:55:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130284#comment-14130284 ]

Zhe Zhang commented on HDFS-6867:
---------------------------------

The test failure seems unrelated: the recovery worker is not enabled in {{TestPipelinesFailover}},
and that unit test passed when I ran it locally.

Basically, when notified by {{RecoveryWorker}}, the {{DataStreamer}} does the following to
recover the pipeline ({{setupPipelineForAppendOrRecovery}} is also modified to always try
to add a new DN when the state is {{UNDER_RECOV}}). Since we are not following the existing recovery
logic of catching DN errors from {{ResponseProcessor}} and running {{processDatanodeError}},
I want to make sure nothing is missing. In particular, I'd like to discuss how to set the
value of {{stage}}. I think it should be {{PIPELINE_SETUP_STREAMING_RECOVERY}}, but let me
know if it should be something else.

{code:java}
          // Before sending each packet, check if the block is under
          // background recovery
          if (backgroundRecoveryState.get() ==
              HdfsConstants.BackgroundRecoveryState.UNDER_RECOV) {
            ...
            closeStream();
            try {
              stage = BlockConstructionStage.PIPELINE_SETUP_STREAMING_RECOVERY;
              setupPipelineForAppendOrRecovery();
            } catch (IOException e) {
              DFSClient.LOG.warn("Caught exception ", e);
            }
            ...
          }
{code}
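
To make the control flow above easier to discuss, here is a minimal, self-contained sketch that mimics the check with toy classes. It is not the patch: {{ToyStreamer}}, {{RecoveryState}}, {{Stage}} and the {{log}} list are hypothetical stand-ins, the {{NORMAL}} state is assumed, and resetting the state and stage after a successful setup is an assumption about what the elided code might do.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

public class BackgroundRecoverySketch {
    // Hypothetical stand-in for HdfsConstants.BackgroundRecoveryState.
    // Only UNDER_RECOV appears in the snippet above; NORMAL is assumed.
    enum RecoveryState { NORMAL, UNDER_RECOV }

    // Stand-in for BlockConstructionStage, reduced to two values.
    enum Stage { DATA_STREAMING, PIPELINE_SETUP_STREAMING_RECOVERY }

    /** Toy streamer that records its actions instead of talking to datanodes. */
    static class ToyStreamer {
        // Set by the (hypothetical) background worker, read by the streamer.
        final AtomicReference<RecoveryState> backgroundRecoveryState =
            new AtomicReference<>(RecoveryState.NORMAL);
        final List<String> log = new ArrayList<>();
        Stage stage = Stage.DATA_STREAMING;

        void closeStream() {
            log.add("closeStream");
        }

        void setupPipelineForAppendOrRecovery() throws IOException {
            log.add("setupPipeline:" + stage);
        }

        void sendPacket(int seqno) {
            // Before sending each packet, check for background recovery.
            if (backgroundRecoveryState.get() == RecoveryState.UNDER_RECOV) {
                closeStream();
                try {
                    stage = Stage.PIPELINE_SETUP_STREAMING_RECOVERY;
                    setupPipelineForAppendOrRecovery();
                    // Assumption: after a successful setup, resume streaming
                    // and clear the recovery flag.
                    stage = Stage.DATA_STREAMING;
                    backgroundRecoveryState.set(RecoveryState.NORMAL);
                } catch (IOException e) {
                    log.add("recoveryFailed");
                }
            }
            log.add("send:" + seqno);
        }
    }

    public static void main(String[] args) {
        ToyStreamer s = new ToyStreamer();
        s.sendPacket(0);
        // Simulates the notification from the recovery worker.
        s.backgroundRecoveryState.set(RecoveryState.UNDER_RECOV);
        s.sendPacket(1);
        s.sendPacket(2);
        System.out.println(s.log);
    }
}
```

In this toy version the pipeline is rebuilt exactly once, between packets 0 and 1, and the stage passed to the setup call is the open question raised above.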

> For DFSOutputStream, do pipeline recovery for a single block in the background
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-6867
>                 URL: https://issues.apache.org/jira/browse/HDFS-6867
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>            Reporter: Colin Patrick McCabe
>            Assignee: Zhe Zhang
>         Attachments: HDFS-6867-20140827-2.patch, HDFS-6867-20140827-3.patch, HDFS-6867-20140827.patch,
HDFS-6867-20140828-1.patch, HDFS-6867-20140828-2.patch, HDFS-6867-20140910.patch, HDFS-6867-design-20140820.pdf,
HDFS-6867-design-20140821.pdf, HDFS-6867-design-20140822.pdf, HDFS-6867-design-20140827.pdf,
HDFS-6867-design-20140910.pdf
>
>
> For DFSOutputStream, we should be able to do pipeline recovery in the background, while
the user is continuing to write to the file.  This is especially useful for long-lived clients
that write to an HDFS file slowly. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
