Date: Sat, 30 Jul 2016 11:07:20 +0000 (UTC)
From: "Brahma Reddy Battula (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-6937) Another issue in handling checksum errors in write pipeline

    [ https://issues.apache.org/jira/browse/HDFS-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400633#comment-15400633 ]

Brahma Reddy Battula commented on HDFS-6937:
--------------------------------------------

[~yzhangal] and [~jojochuang],
thanks for the deeper look here.

We came across an issue where a write failed even though 7 DNs were available, because of a network fault at one datanode that was LAST_IN_PIPELINE.

Scenario (DN3 has a network fault and min repl=2):

Write pipeline:
DN1->DN2->DN3 => DN3 gives an ERROR_CHECKSUM ack, so DN2 is marked as bad
DN1->DN4->DN3 => DN3 gives an ERROR_CHECKSUM ack, so DN4 is marked as bad
...
and so on (DN3 is LAST_IN_PIPELINE every time), continuing until no datanodes are left to construct the pipeline.

We think this could be handled as follows:
Instead of throwing an IOException for an ERROR_CHECKSUM ack from downstream, we could send the pipeline ack back, and on the client side replace both DN2 and DN3 with new nodes, since we can't decide which of the two has the network problem.

> Another issue in handling checksum errors in write pipeline
> -----------------------------------------------------------
>
>                 Key: HDFS-6937
>                 URL: https://issues.apache.org/jira/browse/HDFS-6937
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, hdfs-client
>    Affects Versions: 2.5.0
>            Reporter: Yongjun Zhang
>            Assignee: Wei-Chiu Chuang
>         Attachments: HDFS-6937.001.patch, HDFS-6937.002.patch
>
>
> Given a write pipeline:
> DN1 -> DN2 -> DN3
> DN3 detected a checksum error and terminated; DN2 truncates its replica to the ACKed size. Then a new pipeline is attempted as
> DN1 -> DN2 -> DN4
> DN4 detects a checksum error again. Later, when DN4 was replaced with DN5 (and so on), it failed for the same reason. This led to the observation that DN2's data is corrupted.
> Found that the software currently truncates DN2's replica to the ACKed size after DN3 terminates. But it doesn't check the correctness of the data already written to disk.
> So intuitively, a solution would be: when the downstream DN (DN3 here) finds a checksum error, it propagates this info back to the upstream DN (DN2 here); DN2 checks the correctness of the data already written to disk and truncates the replica to MIN(correctDataSize, ACKedSize).
> Found this issue is similar to what was reported by HDFS-3875, and the truncation at DN2 was actually introduced as part of the HDFS-3875 solution.
> Filing this jira for the issue reported here. HDFS-3875 was filed by [~tlipcon], and I found that he proposed something similar there.
> {quote}
> if the tail node in the pipeline detects a checksum error, then it returns a special error code back up the pipeline indicating this (rather than just disconnecting)
> if a non-tail node receives this error code, then it immediately scans its own block on disk (from the beginning up through the last acked length). If it detects a corruption on its local copy, then it should assume that it is the faulty one, rather than the downstream neighbor. If it detects no corruption, then the faulty node is either the downstream mirror or the network link between the two, and the current behavior is reasonable.
> {quote}
> Thanks.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
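
For illustration only, the MIN(correctDataSize, ACKedSize) rule from the quoted issue description could be sketched roughly as below. This is not the HDFS datanode code and is not taken from either attached patch. The function names, the 512-byte chunk size, and the use of `zlib.crc32` as the per-chunk checksum are all assumptions made for this sketch.

```python
import zlib

CHUNK_SIZE = 512  # assumed bytes per checksum; hypothetical value for this sketch


def chunk_checksums(data, chunk_size=CHUNK_SIZE):
    """Return one CRC32 per chunk of data (the last chunk may be partial)."""
    return [zlib.crc32(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]


def correct_data_size(data, checksums, chunk_size=CHUNK_SIZE):
    """Length of the longest prefix whose chunks all match the stored checksums."""
    verified = 0
    for index, expected in enumerate(checksums):
        chunk = data[index * chunk_size:(index + 1) * chunk_size]
        # Stop at the first missing or mismatching chunk: everything after it
        # cannot be trusted, even if later chunks happen to verify.
        if not chunk or zlib.crc32(chunk) != expected:
            break
        verified += len(chunk)
    return verified


def truncation_length(data, checksums, acked_size, chunk_size=CHUNK_SIZE):
    """Length to truncate the replica to: MIN(correctDataSize, ACKedSize)."""
    return min(correct_data_size(data, checksums, chunk_size), acked_size)


if __name__ == "__main__":
    original = bytes(range(256)) * 8          # 2048 bytes, i.e. 4 chunks
    stored = chunk_checksums(original)        # checksums as originally received

    # Clean replica: the ACKed size is the limiting factor.
    print(truncation_length(original, stored, acked_size=1536))

    # Replica with one flipped byte inside the second chunk (offset 700).
    on_disk = bytearray(original)
    on_disk[700] ^= 0xFF
    print(truncation_length(bytes(on_disk), stored, acked_size=1536))
```

With a clean replica the result is the ACKed size (1536 here). With the corrupted byte in the second chunk, only the first chunk verifies, so the result drops to 512. In that second case the upstream node would also know that its own copy is the faulty one, which is the distinction the quoted HDFS-3875 proposal relies on.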