Date: Fri, 19 Feb 2016 06:35:19 +0000 (UTC)
From: "Kai Zheng (JIRA)"
To: hdfs-dev@hadoop.apache.org
Reply-To: hdfs-dev@hadoop.apache.org
Subject: [jira] [Created] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing missed/corrupt block

Kai Zheng created HDFS-9833:
-------------------------------

             Summary: Erasure coding: recomputing block checksum on the fly by reconstructing missed/corrupt block
                 Key: HDFS-9833
                 URL: https://issues.apache.org/jira/browse/HDFS-9833
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: Kai Zheng
            Assignee: Kai Zheng

As discussed in HDFS-8430 and HDFS-9694, to compute a striped file checksum even when some of the striped blocks are missing, we need to consider recomputing the block checksum on the fly for the
missing or corrupt blocks. To recompute the block checksum, the block data must first be reconstructed by erasure decoding. Most of the code needed for block reconstruction can be borrowed from HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In the EC worker, reconstructed blocks are written out to target datanodes, but in this case the remote write isn't necessary, because the reconstructed block data is used only to recompute the checksum.
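
The idea above (reconstruct the missing block in memory, checksum it, and never write it to a target datanode) can be sketched roughly as follows. This is only an illustration under loud assumptions: it uses a single XOR parity block as a stand-in for the real erasure codec, a plain java.util.zip.CRC32 over the whole block rather than HDFS's actual block checksum scheme, and the class and method names (RecomputeChecksumSketch, xorOfOthers, recomputeChecksum) are hypothetical and not part of Hadoop or of HDFS-9719.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical sketch; not Hadoop code. XOR parity stands in for the real codec.
final class RecomputeChecksumSketch {

    // XOR together every unit except the one at index `skip` (use -1 to skip none).
    // With a single XOR parity unit, this rebuilds the skipped unit.
    static byte[] xorOfOthers(byte[][] units, int skip, int length) {
        byte[] out = new byte[length];
        for (int i = 0; i < units.length; i++) {
            if (i == skip) {
                continue; // the missing/corrupt unit is never read
            }
            for (int j = 0; j < length; j++) {
                out[j] ^= units[i][j];
            }
        }
        return out;
    }

    // Plain CRC32 over a whole buffer (assumption: real HDFS checksums differ).
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // Reconstruct the missing unit in memory and checksum it.
    // The rebuilt bytes are discarded afterwards; nothing is written remotely.
    static long recomputeChecksum(byte[][] units, int missing, int length) {
        byte[] rebuilt = xorOfOthers(units, missing, length);
        return crc32(rebuilt);
    }

    public static void main(String[] args) {
        int length = 8;
        byte[][] data = {
            "block-0!".getBytes(StandardCharsets.US_ASCII),
            "block-1!".getBytes(StandardCharsets.US_ASCII),
            "block-2!".getBytes(StandardCharsets.US_ASCII),
        };
        byte[] parity = xorOfOthers(data, -1, length);

        // Checksum of the original block, kept only to compare against.
        long expected = crc32(data[1]);

        // Simulate losing data block 1: only the survivors and parity remain.
        byte[][] units = { data[0], null, data[2], parity };
        long recomputed = recomputeChecksum(units, 1, length);

        System.out.println("checksums match: " + (expected == recomputed));
    }
}
```

The point of the sketch is the shape of recomputeChecksum: decoding feeds straight into the checksum, with no output stream to a datanode, which is the difference from the EC worker path described above.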