Subject: Re: hdfs /DN errors
From: Jack Levin <magnito@gmail.com>
To: user@hbase.apache.org
Date: Mon, 28 Mar 2011 21:28:31 -0700

Good evening. Has anyone seen this in your logs? It could be something
simple that we are missing. We are also seeing that DataNodes can't be
reached on web port 50075 every once in a while.

-Jack

On Mon, Mar 28, 2011 at 4:19 PM, Jack Levin wrote:
> Hello guys, we are getting these errors:
>
> 2011-03-28 15:08:33,485 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.101.6.5:50010, dest: /10.101.6.5:51365, bytes: 66564, op: HDFS_READ, cliID: DFSClient_hb_rs_rdaf5.prod.imageshack.com,60020,1301323415015_1301323415053, offset: 4191232, srvID: DS-1528941561-10.101.6.5-50010-1299713950021, blockid: blk_-3087497822408705276_723501, duration: 14409579
> 2011-03-28 15:08:33,492 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.101.6.5:50010, dest: /10.101.6.5:51366, bytes: 14964, op: HDFS_READ, cliID: DFSClient_hb_rs_rdaf5.prod.imageshack.com,60020,1301323415015_1301323415053, offset: 67094016, srvID: DS-1528941561-10.101.6.5-50010-1299713950021, blockid: blk_-3224146686136187733_731011, duration: 8855000
> 2011-03-28 15:08:33,495 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.101.6.5:50010, dest: /10.101.6.5:51368, bytes: 51600, op: HDFS_READ, cliID: DFSClient_hb_rs_rdaf5.prod.imageshack.com,60020,1301323415015_1301323415053, offset: 0, srvID: DS-1528941561-10.101.6.5-50010-1299713950021, blockid: blk_-6384334583345199846_731014, duration: 2053969
> 2011-03-28 15:08:33,503 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.101.6.5:50010, dest: /10.101.6.5:42553, bytes: 462336, op: HDFS_READ, cliID: DFSClient_hb_rs_rdaf5.prod.imageshack.com,60020,1301323415015_1301323415053, offset: 327680, srvID: DS-1528941561-10.101.6.5-50010-1299713950021, blockid: blk_-4751283294726600221_724785, duration: 480254862706
> 2011-03-28 15:08:33,504 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.101.6.5:50010, storageID=DS-1528941561-10.101.6.5-50010-1299713950021, infoPort=50075, ipcPort=50020):Got exception while serving blk_-4751283294726600221_724785 to /10.101.6.5:
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.101.6.5:50010 remote=/10.101.6.5:42553]
>         at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>         at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>         at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:350)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:436)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:197)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:110)
>
> 2011-03-28 15:08:33,504 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.101.6.5:50010, storageID=DS-1528941561-10.101.6.5-50010-1299713950021, infoPort=50075, ipcPort=50020):DataXceiver
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.101.6.5:50010 remote=/10.101.6.5:42553]
>         at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>         at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>         at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:350)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:436)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:197)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:110)
> 2011-03-28 15:08:33,504 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.101.6.5:50010, dest: /10.101.6.5:51369, bytes: 66564, op: HDFS_READ, cliID: DFSClient_hb_rs_rdaf5.prod.imageshack.com,60020,1301323415015_1301323415053, offset: 4781568, srvID: DS-1528941561-10.101.6.5-50010-1299713950021, blockid: blk_-3087497822408705276_723501, duration: 11478016
> 2011-03-28 15:08:33,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.101.6.5:50010, dest: /10.101.6.5:51370, bytes: 66564, op: HDFS_READ, cliID: DFSClient_hb_rs_rdaf5.prod.imageshack.com,60020,1301323415015_1301323415053, offset: 66962944, srvID: DS-1528941561-10.101.6.5-50010-1299713950021, blockid: blk_-3224146686136187733_731011, duration: 7643688
>
> This is the RS talking to the DN, and we are getting timeouts. There are no issues like ulimit AFAIK, as we start the daemons with a 32k open-file limit. Any ideas what the deal is?
>
> -Jack
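
[Editor's note, not part of the original thread: the 480000 millis figure in the stack traces is the DataNode's default socket write timeout (dfs.datanode.socket.write.timeout, 8 minutes), and the duration field on the timed-out read, 480254862706, is in nanoseconds, i.e. roughly 480 seconds: the DataNode waited the full window for the region-server client to drain the read before giving up. That usually points at a stalled or GC-pausing client rather than the DataNode itself. A minimal hdfs-site.xml sketch, assuming Hadoop 0.20-era property names; the values are illustrative, not a recommendation:]

```xml
<!-- Sketch only: raising timeouts hides a stalled client, it does not fix one. -->
<property>
  <!-- DataNode-side write timeout; 480000 ms (8 min) is the default
       that appears in the stack trace above. 0 disables the timeout. -->
  <name>dfs.datanode.socket.write.timeout</name>
  <value>960000</value> <!-- illustrative: double the default -->
</property>
<property>
  <!-- Read-side socket timeout; default 60000 ms. -->
  <name>dfs.socket.timeout</name>
  <value>120000</value> <!-- illustrative -->
</property>
```

[If the region server was pausing (e.g. a long GC) between 15:00 and 15:08, a larger timeout only delays the error; checking the region-server GC logs around 15:08:33 would distinguish the two cases.]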