Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Date: Mon, 16 May 2016 07:13:12 +0000 (UTC)
From: "Kai Zheng (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Message-ID: <JIRA.12940402.1455863698000.194720.1463382792936@Atlassian.JIRA>
In-Reply-To: <JIRA.12940402.1455863698000@Atlassian.JIRA>
References: <JIRA.12940402.1455863698000@Atlassian.JIRA> <JIRA.12940402.1455863698525@arcas>
Subject: [jira] [Commented] (HDFS-9833) Erasure coding: recomputing block
 checksum on the fly by reconstructing the missed/corrupt block data
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Mon, 16 May 2016 07:13:14 -0000


    [ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284221#comment-15284221 ] 

Kai Zheng commented on HDFS-9833:
---------------------------------

Thanks [~rakeshr] for the substantial work. I will take a look and give my feedback.

bq. How about optimizing the checksum recomputation logic to address multiple datanode failures and reconstructing it together through another sub-task?
Sounds like a good plan. Handling multiple block failures shouldn't have a big impact on the existing code, which already works for a single block failure. This is similar to the ECWorker.

> Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-9833
>                 URL: https://issues.apache.org/jira/browse/HDFS-9833
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Kai Zheng
>            Assignee: Rakesh R
>              Labels: hdfs-ec-3.0-must-do
>         Attachments: HDFS-9833-00-draft.patch
>
>
> As discussed in HDFS-8430 and HDFS-9694, to compute a striped file checksum even when some of the striped blocks are missing, we need to consider recomputing the block checksum on the fly for the missing/corrupt blocks. To recompute the block checksum, the block data needs to be reconstructed by erasure decoding, and most of the code needed for the block reconstruction could be borrowed from HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In the EC worker, reconstructed blocks need to be written out to target datanodes, but in this case the remote write isn't necessary, as the reconstructed block data is only used to recompute the checksum.
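
To illustrate the idea in the description (reconstruct the lost block data in memory, checksum it, and skip the write to a target datanode), here is a minimal sketch. It is not the HDFS-9833 patch and uses no Hadoop classes: a single XOR parity unit stands in for the real erasure codec, {{java.util.zip.CRC32}} stands in for the block checksum, and the class and method names ({{ChecksumOnTheFly}}, {{reconstructByXor}}, {{crc32}}) are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class ChecksumOnTheFly {

    // Assumption: one XOR parity unit per group, so exactly one lost unit
    // can be rebuilt by XOR-ing all surviving units (data and parity).
    static byte[] reconstructByXor(byte[][] survivors) {
        byte[] rebuilt = new byte[survivors[0].length];
        for (byte[] unit : survivors) {
            for (int i = 0; i < rebuilt.length; i++) {
                rebuilt[i] ^= unit[i];
            }
        }
        return rebuilt;
    }

    // Stand-in checksum; computed over an in-memory buffer only.
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] d0 = "block-0 data".getBytes(StandardCharsets.US_ASCII);
        byte[] d1 = "block-1 data".getBytes(StandardCharsets.US_ASCII);

        // Build the parity unit as it would exist before any failure.
        byte[] parity = new byte[d0.length];
        for (int i = 0; i < parity.length; i++) {
            parity[i] = (byte) (d0[i] ^ d1[i]);
        }

        // Pretend d1 is lost: rebuild it from d0 and parity, then checksum
        // the rebuilt bytes directly. Nothing is written out anywhere.
        byte[] rebuilt = reconstructByXor(new byte[][] {d0, parity});
        System.out.println(crc32(rebuilt) == crc32(d1));
    }
}
```

The only point of the sketch is that the reconstructed buffer feeds the checksum and is then discarded, which is the difference from the EC worker path described above.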

