Date: Mon, 4 May 2015 15:21:07 +0000 (UTC)
From: "Yi Liu (JIRA)"
To: common-issues@hadoop.apache.org
Reply-To: common-issues@hadoop.apache.org
Subject: [jira] [Commented] (HADOOP-11847) Enhance raw coder allowing to read least required inputs in decoding

[ https://issues.apache.org/jira/browse/HADOOP-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526741#comment-14526741 ]

Yi Liu commented on HADOOP-11847:
---------------------------------

Hi Zhe, for striped block recovery there are three situations: 1) only parity blocks are missing, 2) only data blocks are missing, 3) both parity and data blocks are missing.

Until this patch is committed, in HDFS-7348 I use encode as a workaround for #1, but it will encode all parity blocks.
For #2, I found that decode only works for data blocks, and erasureIndices needs some special handling (see the decode test). So in the HDFS-7348 test I made parityBlkNum of the data blocks missing, and then it works, but we need the full set of inputs and have to allocate more buffers. For #3, it doesn't work and there is no test.

So without this fix, the decode in HDFS-7348 and HDFS-7678 is just a workaround, and we will still need to update it after this patch. Even though the decode interface is the same, the requirements for the input parameters are different, so the code logic will be different. Should we review and push this patch as soon as possible? It's a blocking issue.

Ideally, for {{decode}}:
1) the input is the minimal set of input blocks (may include data or parity blocks),
2) the indices of the input blocks are given, or there is some other way to let the decode function know them,
3) the output is the blocks to be recovered (one or more),
4) the indices of the output blocks are given.

> Enhance raw coder allowing to read least required inputs in decoding
> --------------------------------------------------------------------
>
>                 Key: HADOOP-11847
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11847
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: io
>            Reporter: Kai Zheng
>            Assignee: Kai Zheng
>         Attachments: HADOOP-11847-HDFS-7285-v3.patch, HADOOP-11847-HDFS-7285-v4.patch, HADOOP-11847-v1.patch, HADOOP-11847-v2.patch
>
>
> This is to enhance the raw erasure coder to allow reading only the least required inputs while decoding. It will also refine and document the relevant APIs for better understanding and usage. Using the least required inputs may add computing overhead, but will possibly perform better overall since less network traffic and disk IO are involved.
> This is something that was planned, but I just got reminded of it by [~zhz]'s question raised in HDFS-7678, also copied here:
> bq. Kai Zheng I have a question about decoding: in a (6+3) schema, if block #2 is missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how should I construct the inputs to RawErasureDecoder#decode?
> With this work, hopefully the answer to the above question will be obvious.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
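
For the (6+3) question quoted above, the four-point {{decode}} contract from the comment could be laid out roughly as in the sketch below. This is only an illustration under stated assumptions, not the patch's implementation: the class DecodeInputLayout and its methods buildInputs and buildOutputs are hypothetical names, and the convention that the input array has one slot per unit (data units first, then parity) with null marking units that are erased or simply not read is an assumption made here. No Hadoop API is called.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Hadoop: lays out decode buffers for a (6+3) schema. */
final class DecodeInputLayout {
    static final int DATA_UNITS = 6;
    static final int PARITY_UNITS = 3;

    /**
     * Builds an input array with one slot per unit (data units first, then parity).
     * Slots for units that are erased or not read stay null (assumed convention).
     */
    static byte[][] buildInputs(int[] readIndexes, int cellSize) {
        if (readIndexes.length < DATA_UNITS) {
            throw new IllegalArgumentException(
                "need at least " + DATA_UNITS + " input units to decode");
        }
        byte[][] inputs = new byte[DATA_UNITS + PARITY_UNITS][];
        for (int idx : readIndexes) {
            // In real use this buffer would hold the bytes read from unit idx.
            inputs[idx] = new byte[cellSize];
        }
        return inputs;
    }

    /** Allocates one output buffer per unit to be recovered. */
    static byte[][] buildOutputs(int[] erasedIndexes, int cellSize) {
        byte[][] outputs = new byte[erasedIndexes.length][];
        for (int i = 0; i < outputs.length; i++) {
            outputs[i] = new byte[cellSize];
        }
        return outputs;
    }

    public static void main(String[] args) {
        int[] readIndexes = {0, 1, 3, 4, 5, 8}; // the minimal set of blocks actually read
        int[] erasedIndexes = {2};              // the block to recover
        byte[][] inputs = buildInputs(readIndexes, 4);
        byte[][] outputs = buildOutputs(erasedIndexes, 4);

        // A decoder following this convention would now be invoked with
        // (inputs, erasedIndexes, outputs); that call is omitted here.
        List<Integer> present = new ArrayList<>();
        for (int i = 0; i < inputs.length; i++) {
            if (inputs[i] != null) {
                present.add(i);
            }
        }
        System.out.println("input slots filled: " + present);
        System.out.println("output buffers: " + outputs.length);
    }
}
```

With this layout the position of a buffer in the array doubles as its block index, so points 1) and 2) of the list collapse into a single argument, and units 6 and 7 never have to be read or allocated.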