hadoop-hdfs-issues mailing list archives

From "Zhe Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6867) For DFSOutputStream, do pipeline recovery for a single block in the background
Date Thu, 11 Sep 2014 16:55:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130284#comment-14130284 ]

Zhe Zhang commented on HDFS-6867:

The test failure seems unrelated. The recovery worker is not enabled in {{TestPipelinesFailover}}.
I tried that unit test and it passed locally.

Basically, when notified by {{RecoveryWorker}}, the {{DataStreamer}} does the following to
recover the pipeline ({{setupPipelineForAppendOrRecovery}} is also modified to always try
to add a new DN when the state is {{UNDER_RECOV}}). Since we are not following the existing
recovery logic of catching DN errors from {{ResponseProcessor}} and running
{{processDatanodeError}}, I want to make sure nothing is missing. In particular, I'd like to
discuss how to set the value of {{stage}}. I think it should be
{{PIPELINE_SETUP_STREAMING_RECOVERY}}, but let me know if it should be something else.

          // Before sending each packet, check if the block is under
          // background recovery
          if (backgroundRecoveryState.get() ==
              HdfsConstants.BackgroundRecoveryState.UNDER_RECOV) {
            try {
              stage = BlockConstructionStage.PIPELINE_SETUP_STREAMING_RECOVERY;
            } catch (IOException e) {
              DFSClient.LOG.warn("Caught exception ", e);
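
For readers without the patch at hand, the check-before-send pattern described above can be sketched in isolation roughly as follows. This is a simplified illustration, not the patch itself: the class {{BackgroundRecoverySketch}}, the {{NORMAL}} state, the local {{Stage}} enum, the methods {{notifyRecoveryNeeded}}, {{checkAndRecover}} and {{setupPipeline}}, and the choice to reset the state after recovery are all assumptions made for this example. Only {{UNDER_RECOV}} and {{PIPELINE_SETUP_STREAMING_RECOVERY}} are taken from the comment.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical, simplified stand-in for the streamer side of background recovery.
// None of these types are the real HDFS client classes.
class BackgroundRecoverySketch {
    // NORMAL is an assumed name; UNDER_RECOV comes from the comment above.
    enum BackgroundRecoveryState { NORMAL, UNDER_RECOV }

    // Local stand-in for BlockConstructionStage, reduced to two values.
    enum Stage { DATA_STREAMING, PIPELINE_SETUP_STREAMING_RECOVERY }

    // Written by the recovery worker thread, read by the streamer thread.
    private final AtomicReference<BackgroundRecoveryState> backgroundRecoveryState =
        new AtomicReference<>(BackgroundRecoveryState.NORMAL);

    private Stage stage = Stage.DATA_STREAMING;
    private int pipelineSetups = 0;

    // Called by the (hypothetical) recovery worker to request a recovery.
    void notifyRecoveryNeeded() {
        backgroundRecoveryState.set(BackgroundRecoveryState.UNDER_RECOV);
    }

    // Called by the streamer before sending each packet.
    // Returns true if a recovery was performed on this call.
    boolean checkAndRecover() {
        if (backgroundRecoveryState.get() != BackgroundRecoveryState.UNDER_RECOV) {
            return false;
        }
        stage = Stage.PIPELINE_SETUP_STREAMING_RECOVERY;
        setupPipeline();
        // Assumption: once the pipeline is rebuilt, streaming resumes and
        // the recovery request is cleared.
        stage = Stage.DATA_STREAMING;
        backgroundRecoveryState.set(BackgroundRecoveryState.NORMAL);
        return true;
    }

    // Placeholder for the real pipeline setup; it only counts invocations.
    private void setupPipeline() {
        pipelineSetups++;
    }

    int pipelineSetups() {
        return pipelineSetups;
    }

    Stage stage() {
        return stage;
    }

    public static void main(String[] args) {
        BackgroundRecoverySketch streamer = new BackgroundRecoverySketch();
        streamer.notifyRecoveryNeeded();
        System.out.println("recovered: " + streamer.checkAndRecover());
    }
}
```

The sketch keeps the state in an {{AtomicReference}} so the worker can flag a recovery without taking the streamer's lock; whether the real patch resets the state at this point or elsewhere is not shown in the excerpt above.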

> For DFSOutputStream, do pipeline recovery for a single block in the background
> ------------------------------------------------------------------------------
>                 Key: HDFS-6867
>                 URL: https://issues.apache.org/jira/browse/HDFS-6867
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>            Reporter: Colin Patrick McCabe
>            Assignee: Zhe Zhang
>         Attachments: HDFS-6867-20140827-2.patch, HDFS-6867-20140827-3.patch, HDFS-6867-20140827.patch,
HDFS-6867-20140828-1.patch, HDFS-6867-20140828-2.patch, HDFS-6867-20140910.patch, HDFS-6867-design-20140820.pdf,
HDFS-6867-design-20140821.pdf, HDFS-6867-design-20140822.pdf, HDFS-6867-design-20140827.pdf,
> For DFSOutputStream, we should be able to do pipeline recovery in the background, while
the user is continuing to write to the file.  This is especially useful for long-lived clients
that write to an HDFS file slowly. 

This message was sent by Atlassian JIRA
