Date: Mon, 7 Sep 2015 07:26:47 +0000 (UTC)
From: "Hudson (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-8960) DFS client says "no more good datanodes being available to try" on a single drive failure

[ https://issues.apache.org/jira/browse/HDFS-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733333#comment-14733333 ]

Hudson commented on HDFS-8960:
------------------------------

FAILURE: Integrated in HBase-1.3 #152 (See [https://builds.apache.org/job/HBase-1.3/152/])
HBASE-14317 Stuck FSHLog: bad disk (HDFS-8960) and can't roll WAL (stack: rev bbafb47f7271449d46b46569ca9f0cb227b44c6e)
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/LogRoller.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/DamagedWALException.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFailedAppendAndSync.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiVersionConcurrencyControl.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiVersionConcurrencyControlBasic.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestWALLockup.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSWALEntry.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALKey.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConcurrencyControl.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SyncFuture.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java

> DFS client says "no more good datanodes being available to try" on a single drive failure
> -----------------------------------------------------------------------------------------
>
>                 Key: HDFS-8960
>                 URL: https://issues.apache.org/jira/browse/HDFS-8960
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.7.1
>        Environment: 
openjdk version "1.8.0_45-internal"
> OpenJDK Runtime Environment (build 1.8.0_45-internal-b14)
> OpenJDK 64-Bit Server VM (build 25.45-b02, mixed mode)
>            Reporter: Benoit Sigoure
>         Attachments: blk_1073817519_77099.log, r12s13-datanode.log, r12s16-datanode.log
>
>
> Since we upgraded to 2.7.1 we regularly see single-drive failures cause widespread problems at the HBase level (with the default 3x replication target).
> Here's an example.  This HBase RegionServer is r12s16 (172.24.32.16) and is writing its WAL to [172.24.32.16:10110, 172.24.32.8:10110, 172.24.32.13:10110] as can be seen by the following occasional messages:
> {code}
> 2015-08-23 06:28:40,272 INFO  [sync.3] wal.FSHLog: Slow sync cost: 123 ms, current pipeline: [172.24.32.16:10110, 172.24.32.8:10110, 172.24.32.13:10110]
> {code}
> A bit later, the second node in the pipeline above is going to experience an HDD failure.
> {code}
> 2015-08-23 07:21:58,720 WARN  [DataStreamer for file /hbase/WALs/r12s16.sjc.aristanetworks.com,9104,1439917659071/r12s16.sjc.aristanetworks.com%2C9104%2C1439917659071.default.1440314434998 block BP-1466258523-172.24.32.1-1437768622582:blk_1073817519_77099] hdfs.DFSClient: Error Recovery for block BP-1466258523-172.24.32.1-1437768622582:blk_1073817519_77099 in pipeline 172.24.32.16:10110, 172.24.32.13:10110, 172.24.32.8:10110: bad datanode 172.24.32.8:10110
> {code}
> And then HBase will go like "omg I can't write to my WAL, let me commit suicide".
> {code}
> 2015-08-23 07:22:26,060 FATAL [regionserver/r12s16.sjc.aristanetworks.com/172.24.32.16:9104.append-pool1-t1] wal.FSHLog: Could not append. Requesting close of wal
> java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[172.24.32.16:10110, 172.24.32.13:10110], original=[172.24.32.16:10110, 172.24.32.13:10110]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:969)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1035)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1184)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:933)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:487)
> {code}
> Whereas this should be mostly a non-event, as the DFS client should just drop the bad replica from the write pipeline.
> This is a small cluster but has 16 DNs, so the failed DN in the pipeline should be easily replaced.  I didn't set {{dfs.client.block.write.replace-datanode-on-failure.policy}} (so it's still {{DEFAULT}}) and didn't set {{dfs.client.block.write.replace-datanode-on-failure.enable}} (so it's still {{true}}).
> I don't see anything noteworthy in the NN log around the time of the failure; it just seems like the DFS client gave up, or threw an exception back to HBase that it wasn't throwing before, or something else, and that made this single drive failure lethal.
> We've occasionally been "unlucky" enough to have a single-drive failure cause multiple RegionServers to commit suicide because they had their WALs on that drive.
> We upgraded from 2.7.0 about a month ago, and I'm not sure whether we were seeing this with 2.7 or not – prior to that we were running in a quite different environment, but this is a fairly new deployment.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
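[Editor's note] For reference, the two client-side knobs named in the report are standard HDFS configuration keys. A minimal hdfs-site.xml sketch of the defaults the reporter describes (these values simply restate the defaults; they are not a fix proposed in this thread):

```xml
<!-- hdfs-site.xml, client side: the defaults described in the report above -->
<configuration>
  <property>
    <!-- When true, the client tries to replace a failed datanode in a write pipeline -->
    <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
    <value>true</value>
  </property>
  <property>
    <!-- One of DEFAULT, NEVER, ALWAYS -->
    <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
    <value>DEFAULT</value>
  </property>
</configuration>
```

Roughly, the DEFAULT policy attempts replacement only when the replication factor is 3 or more and either the pipeline has shrunk to half or fewer nodes or the stream has been hflushed/appended (as WAL writes are), which is why the reporter expected the single bad datanode to be replaced rather than the write to fail.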