hadoop-hdfs-issues mailing list archives

From "Zhe Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8383) Tolerate multiple failures in DFSStripedOutputStream
Date Thu, 03 Sep 2015 23:38:46 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14730042#comment-14730042 ]

Zhe Zhang commented on HDFS-8383:
---------------------------------

Continuing the review on the patch itself:
# Reading the current single-failure handling logic again, I think the {{BlockRecoveryTrigger}}
should work. We make the recovery transactional by setting each healthy streamer's {{externalError}}
before {{updateBlockForPipeline}} and resetting it after {{updatePipeline}}. I think it's
the right approach at this stage.
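For reference, here is that transactional pattern reduced to a minimal sketch. {{StreamerLike}} and its two methods are hypothetical stand-ins for illustration, not the actual {{DataStreamer}} API:
{code}
import java.util.List;

// Minimal sketch of the "transactional" recovery window. StreamerLike and
// its two methods are hypothetical stand-ins, not the real DataStreamer API.
interface StreamerLike {
  void markExternalError();   // streamer parks in processDatanodeError()
  void clearExternalError();  // streamer resumes normal writing
}

class RecoverySketch {
  // bumpGenerationStamp stands for updateBlockForPipeline, commitPipeline
  // for updatePipeline; both run exactly once, inside the error window.
  static void runRecoveryTransaction(List<StreamerLike> healthyStreamers,
      Runnable bumpGenerationStamp, Runnable commitPipeline) {
    // "Begin transaction": every healthy streamer sees the external error.
    for (StreamerLike s : healthyStreamers) {
      s.markExternalError();
    }
    try {
      bumpGenerationStamp.run();  // new block with a bumped generation stamp
      commitPipeline.run();       // NameNode now reflects the new pipeline
    } finally {
      // "Commit": only after updatePipeline may streamers write again.
      for (StreamerLike s : healthyStreamers) {
        s.clearExternalError();
      }
    }
  }
}
{code}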
# Why not increment {{numScheduled}} when it's already positive?
{code}
+        if (numScheduled == 0) {
+          numScheduled++;
+        }
{code}
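To make the question concrete, here is the behavioral difference I have in mind (a sketch under assumed {{numScheduled}} semantics, not code from the patch):
{code}
// Sketch only; assumes numScheduled counts pending recovery rounds.
private int numScheduled = 0;

// Patch behavior: failures arriving while a round is pending coalesce
// into that single round.
synchronized void scheduleCoalesced() {
  if (numScheduled == 0) {
    numScheduled++;
  }
}

// Alternative being asked about: each failure schedules its own round,
// so a failure during an in-flight recovery is not absorbed.
synchronized void scheduleCounted() {
  numScheduled++;
}
{code}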
# The error handling logic is quite complex now; we should use this chance to add more explanation.
Below is my draft. [~walter.k.su], if it looks OK to you, could you help add it to the patch?
{code}
  class Coordinator {
    /**
     * The next internal block to write to, allocated by the fastest streamer
     * (earliest to finish writing the current internal block) by calling
     * {@link StripedDataStreamer#locateFollowingBlock}.
     */
    private final MultipleBlockingQueue<LocatedBlock> followingBlocks;

    /**
     * Records the number of bytes actually written to the most recent internal
     * block. Used to calculate the size of the entire block group.
     */
    private final MultipleBlockingQueue<ExtendedBlock> endBlocks;

    /**
     * The following two queues are used to handle stream failures.
     *
     * When stream_i fails, the OutputStream notifies all other healthy
     * streamers by setting an external error on each of them, which triggers
     * {@link DataStreamer#processDatanodeError}. The first streamer to reach
     * the external error calls {@link DataStreamer#updateBlockForPipeline}
     * to get a new block with a bumped generation stamp, and populates
     * {@link #newBlocks} for the other streamers. The same streamer then
     * calls {@link DataStreamer#updatePipeline} to update the NameNode state
     * for the block.
     */
    private final MultipleBlockingQueue<LocatedBlock> newBlocks;
    private final MultipleBlockingQueue<ExtendedBlock> updateBlocks;
{code}
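To make the Javadoc concrete, the intended flow might look roughly like this. This is only a reading aid: {{isFirstToReachError}} and the method shape are hypothetical, and only the queue names come from the patch.
{code}
// Rough reading aid, not the real implementation.
void onExternalError(int streamerIndex) throws IOException, InterruptedException {
  if (isFirstToReachError(streamerIndex)) {
    // Leader: bump the generation stamp once for the whole block group ...
    LocatedBlock bumped = updateBlockForPipeline();
    for (int i = 0; i < numAllBlocks; i++) {
      if (i != streamerIndex) {
        newBlocks.offer(i, bumped);   // ... hand it to every follower ...
      }
    }
    updatePipeline();                 // ... and commit the NameNode state.
  } else {
    // Follower: block until the leader publishes the bumped block.
    LocatedBlock bumped = newBlocks.take(streamerIndex);
    // resume writing with the new generation stamp
  }
}
{code}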
# Naming suggestions:
{code}
BlockRecoveryTrigger -> PipelineRecoveryManager or PipelineRecoveryCoordinator (I don't
have a strong opinion but we can also consider moving the class under Coordinator).
trigger() -> addRecoveryWork()
isRecovering() -> isUnderRecovery()
{code}
# The patch at HDFS-8704 removes the {{setFailed}} API, so we need to coordinate the two efforts.
Pinging [~libo-intel] for comments.

Follow-ons:
# {{waitLastRecoveryToFinish}} can be improved. The current logic waits for the slowest streamer
to get out of the {{externalError}} state.
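One possible shape for this, purely as a sketch (none of these names are in the patch): track in-flight recoveries with a single counter and wait on it, instead of polling every streamer's {{externalError}} flag.
{code}
// Hypothetical sketch, not the patch's code.
private final Object recoveryLock = new Object();
private int inFlightRecoveries = 0;

void recoveryStarted() {
  synchronized (recoveryLock) { inFlightRecoveries++; }
}

void recoveryFinished() {
  synchronized (recoveryLock) {
    if (--inFlightRecoveries == 0) {
      recoveryLock.notifyAll();
    }
  }
}

void waitLastRecoveryToFinish() throws InterruptedException {
  synchronized (recoveryLock) {
    while (inFlightRecoveries > 0) {
      recoveryLock.wait();
    }
  }
}
{code}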
# {{externalError}} is actually quite awkward in {{DataStreamer}} -- it's a meaningless concept
for a non-EC {{DataStreamer}}.
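One way to keep the concept out of the base class, again just a sketch ({{shouldHandleExternalError}} is a hypothetical hook, not existing API):
{code}
// Sketch of pushing externalError down into the EC-specific subclass.
class DataStreamer {
  // base class stays free of the EC-only concept
  protected boolean shouldHandleExternalError() {
    return false;  // a non-EC streamer never sees an external error
  }
}

class StripedDataStreamer extends DataStreamer {
  private volatile boolean externalError;

  void setExternalError() { externalError = true; }
  void clearExternalError() { externalError = false; }

  @Override
  protected boolean shouldHandleExternalError() {
    return externalError;  // EC streamers opt in
  }
}
{code}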

> Tolerate multiple failures in DFSStripedOutputStream
> ----------------------------------------------------
>
>                 Key: HDFS-8383
>                 URL: https://issues.apache.org/jira/browse/HDFS-8383
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Walter Su
>         Attachments: HDFS-8383.00.patch
>
>




