From: "Segel, Mike" <msegel@navteq.com>
To: "general@hadoop.apache.org"
Date: Fri, 13 May 2011 17:33:40 -0500
Subject: Re: Stability issue - dead DN's

Ok... Hum, look, I've been force fed a couple of margaritas, so my memory is a bit foggy...

You say your clients connect on nic A. Your cluster connects on nic B.
What happens when you want to upload a file from your client to HDFS? Or even access it?

... ;-)

Sent from a remote device. Please excuse any typos...

Mike Segel

On May 13, 2011, at 4:15 PM, "Evert Lammerts" wrote:

> Hi Mike,
>
> Thanks for trying to help out.
>
> I had a talk with our networking guys this afternoon. According to them (and this is way out of my area of expertise, so excuse any mistakes) multiple interfaces shouldn't be a problem. We could set up a nameserver to resolve hostnames to addresses in our private space when the request comes from one of the nodes, and route this traffic over a single interface. Any other request can be resolved to an address in the public space, which is bound to another interface. In our current setup we're not even resolving hostnames in our private address space through a nameserver - we do it with an ugly hack in /etc/hosts. And it seems to work alright.
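[Archive note: the "/etc/hosts hack" described above typically looks like the sketch below - the same private-range entries replicated on every node, so intra-cluster hostnames resolve to the interconnect addresses while public names go through DNS. The hostnames here are made up for illustration; only the 192.168.28.x private range appears in the thread.]

```text
# /etc/hosts on each node (illustrative; hostnames are not from the thread)
# Private interconnect addresses; public names resolve via DNS instead.
192.168.28.210   dn01.cluster.internal   dn01
192.168.28.211   dn02.cluster.internal   dn02
192.168.28.212   dn03.cluster.internal   dn03
192.168.28.213   dn04.cluster.internal   dn04
192.168.28.214   dn05.cluster.internal   dn05
```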
>
> Having said that, our problems are still not completely gone even after adjusting the maximum allowed RAM for tasks - although things are lots better. While writing this mail three out of five DN's were marked as dead. There still is some swapping going on, but the cores are not spending any time in WAIT, so this shouldn't be the cause of anything. See below a trace from a dead DN - any thoughts are appreciated!
>
> Cheers,
> Evert
>
> 2011-05-13 23:13:27,716 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received block blk_-9131821326787012529_2915672 src: /192.168.28.211:60136 dest: /192.168.28.214:50050 of size 382425
> 2011-05-13 23:13:27,915 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_-9132067116195286882_130888 java.io.EOFException: while trying to read 3744913 bytes
> 2011-05-13 23:13:27,925 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.28.214:50050, dest: /192.168.28.214:35139, bytes: 0, op: HDFS_READ, cliID: DFSClient_attempt_201105131125_0025_m_001437_0, offset: 196608, srvID: DS-443352839-145.100.2.183-50050-1291128673616, blockid: blk_-9163184839986480695_4112368, duration: 6254000
> 2011-05-13 23:13:28,032 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received block blk_-9149862728087355005_3793421 src: /192.168.28.210:41197 dest: /192.168.28.214:50050 of size 245767
> 2011-05-13 23:13:28,033 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block blk_-9132067116195286882_130888 unfinalized and removed.
> 2011-05-13 23:13:28,033 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-9132067116195286882_130888 received exception java.io.EOFException: while trying to read 3744913 bytes
> 2011-05-13 23:13:28,033 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.28.214:50050, storageID=DS-443352839-145.100.2.183-50050-1291128673616, infoPort=50075, ipcPort=50020): DataXceiver
> java.io.EOFException: while trying to read 3744913 bytes
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:270)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:357)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:378)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:534)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:417)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:122)
> 2011-05-13 23:13:28,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.28.214:50050, dest: /192.168.28.214:32910, bytes: 0, op: HDFS_READ, cliID: DFSClient_attempt_201105131125_0025_m_001443_0, offset: 197632, srvID: DS-443352839-145.100.2.183-50050-1291128673616, blockid: blk_-9163184839986480695_4112368, duration: 4323000
> 2011-05-13 23:13:28,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.28.214:50050, dest: /192.168.28.214:35138, bytes: 0, op: HDFS_READ, cliID: DFSClient_attempt_201105131125_0025_m_001440_0, offset: 197120, srvID: DS-443352839-145.100.2.183-50050-1291128673616, blockid: blk_-9163184839986480695_4112368, duration: 5573000
> 2011-05-13 23:13:28,159 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.28.214:50050, dest: /192.168.28.212:38574, bytes: 0, op: HDFS_READ, cliID: DFSClient_attempt_201105131125_0025_m_001444_0, offset: 197632, srvID: DS-443352839-145.100.2.183-50050-1291128673616, blockid: blk_-9163184839986480695_4112368, duration: 16939000
> 2011-05-13 23:13:28,209 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received block blk_-9123390874940601805_2898225 src: /192.168.28.210:44227 dest: /192.168.28.214:50050 of size 300441
> 2011-05-13 23:13:28,217 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.28.214:50050, dest: /192.168.28.213:42364, bytes: 0, op: HDFS_READ, cliID: DFSClient_attempt_201105131125_0025_m_001451_0, offset: 198656, srvID: DS-443352839-145.100.2.183-50050-1291128673616, blockid: blk_-9163184839986480695_4112368, duration: 5291000
> 2011-05-13 23:13:28,252 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.28.214:50050, dest: /192.168.28.214:32930, bytes: 0, op: HDFS_READ, cliID: DFSClient_attempt_201105131125_0025_m_001436_0, offset: 0, srvID: DS-443352839-145.100.2.183-50050-1291128673616, blockid: blk_-1800696633107072247_4099834, duration: 5099000
> 2011-05-13 23:13:28,256 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.28.214:50050, dest: /192.168.28.213:42363, bytes: 0, op: HDFS_READ, cliID: DFSClient_attempt_201105131125_0025_m_001458_0, offset: 199680, srvID: DS-443352839-145.100.2.183-50050-1291128673616, blockid: blk_-9163184839986480695_4112368, duration: 4945000
> 2011-05-13 23:13:28,257 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.28.214:50050, dest: /192.168.28.214:35137, bytes: 0, op: HDFS_READ, cliID: DFSClient_attempt_201105131125_0025_m_001436_0, offset: 196608, srvID: DS-443352839-145.100.2.183-50050-1291128673616, blockid: blk_-9163184839986480695_4112368, duration: 4159000
> 2011-05-13 23:13:28,258 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_-9140444589483291821_3585975 java.io.EOFException: while trying to read 100 bytes
> 2011-05-13 23:13:28,258 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block blk_-9140444589483291821_3585975 unfinalized and removed.
> 2011-05-13 23:13:28,258 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-9140444589483291821_3585975 received exception java.io.EOFException: while trying to read 100 bytes
> 2011-05-13 23:13:28,259 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.28.214:50050, storageID=DS-443352839-145.100.2.183-50050-1291128673616, infoPort=50075, ipcPort=50020): DataXceiver
> java.io.EOFException: while trying to read 100 bytes
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:270)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:357)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:378)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:534)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:417)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:122)
> 2011-05-13 23:13:28,264 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.28.214:50050, dest: /192.168.28.212:38553, bytes: 0, op: HDFS_READ, cliID: DFSClient_attempt_201105131125_0025_m_001441_0, offset: 0, srvID: DS-443352839-145.100.2.183-50050-1291128673616, blockid: blk_-5819719631677148140_4098274, duration: 5625000
> 2011-05-13 23:13:28,264 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.28.214:50050, dest: /192.168.28.212:38535, bytes: 0, op: HDFS_READ, cliID: DFSClient_attempt_201105131125_0025_m_001438_0, offset: 196608, srvID: DS-443352839-145.100.2.183-50050-1291128673616, blockid: blk_-9163184839986480695_4112368, duration: 4473000
> 2011-05-13 23:13:28,265 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.28.214:50050, storageID=DS-443352839-145.100.2.183-50050-1291128673616, infoPort=50075, ipcPort=50020): Exception writing block blk_-9150014886921014525_2267869 to mirror 192.168.28.213:50050
> java.io.IOException: The stream is closed
>         at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:108)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at java.io.DataOutputStream.flush(DataOutputStream.java:106)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:540)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:417)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:122)
>
> 2011-05-13 23:13:28,265 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.28.214:50050, dest: /192.168.28.213:45484, bytes: 0, op: HDFS_READ, cliID: DFSClient_attempt_201105131125_0025_m_001432_0, offset: 0, srvID: DS-443352839-145.100.2.183-50050-1291128673616, blockid: blk_405051931214094755_4098504, duration: 5597000
> 2011-05-13 23:13:28,273 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received block blk_-9150014886921014525_2267869 src: /192.168.28.211:49208 dest: /192.168.28.214:50050 of size 3033173
> 2011-05-13 23:13:28,313 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received block blk_-9144765354308563975_3310572 src: /192.168.28.211:51592 dest: /192.168.28.214:50050 of size 242383
>
> ________________________________________
> From: Segel, Mike [msegel@navteq.com]
> Sent: Friday, May 13, 2011 2:36 PM
> To: general@hadoop.apache.org
> Cc: ;
> Subject: Re: Stability issue - dead DN's
>
> Bonded will work, but you may not see the performance you would expect. If you need more than 1 GbE, go 10GbE: less headache and even more headroom.
>
> Multiple interfaces won't work. Or I should say, didn't work in past releases.
> If you think about it, clients have to connect to each node. So having two interfaces and trying to manage them makes no sense.
>
> Add to this trying to manage this in DNS... Why make more work for yourself?
> Going from memory... It looked like your rDNS had to match your hostnames, so your internal interfaces had to match hostnames, and you had an inverted network.
>
> If you draw out your network topology you end up with a ladder.
> You would be better off (IMHO) to create a subnet where only your edge servers are dual nic'd.
> But then if your cluster is for development... Now your PCs can't be used as clients...
>
> Does this make sense?
>
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On May 13, 2011, at 4:57 AM, "Evert Lammerts" wrote:
>
>> Hi Mike,
>>
>>> You really really don't want to do this.
>>> Long story short... It won't work.
>>
>> Can you elaborate? Are you talking about the bonded interfaces or about having a separated network for interconnects and external network? What can go wrong there?
>>
>>>
>>> Just a suggestion.. You don't want anyone on your cluster itself. They
>>> should interact with edge nodes, which are 'Hadoop aware'. Then your
>>> cluster has a single network to worry about.
>>
>> That's our current setup. We have a single headnode that is used as a SPOE. However, I'd like to change that on our future production system. We want to implement Kerberos for authentication, and let users interact with the cluster from their own machine. This would enable them to submit their jobs from the local IDE.
>> The only way to do this is by opening up Hadoop ports for the world, is my understanding: if people interact with HDFS they need to be able to interact with all nodes, right? What would be the argument against this?
>>
>> Cheers,
>> Evert
>>
>>>
>>>
>>> Sent from a remote device. Please excuse any typos...
>>>
>>> Mike Segel
>>>
>>> On May 11, 2011, at 11:45 AM, Allen Wittenauer wrote:
>>>
>>>>
>>>>>> * a 2x1GE bonded network interface for interconnects
>>>>>> * a 2x1GE bonded network interface for external access
>>>>
>>>> Multiple NICs on a box can sometimes cause big performance problems with Hadoop. So watch your traffic carefully.
>
>
> The information contained in this communication may be CONFIDENTIAL and is intended only for the use of the recipient(s) named above. If you are not the intended recipient, you are hereby notified that any dissemination, distribution, or copying of this communication, or any of its contents, is strictly prohibited. If you have received this communication in error, please notify the sender and delete/destroy the original message and any copy of it from your computer or paper files.
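[Archive note: when triaging DN failures like the trace above, it helps to tally which exception classes dominate the DataNode log before digging into individual blocks. A minimal sketch, assuming the standard log4j text format shown in the trace; the log path is an assumption, point it at your own DataNode log.]

```shell
#!/bin/sh
# Tally exception classes in a Hadoop DataNode log (log4j text format,
# as in the trace above). Usage: count_dn_exceptions.sh [logfile]
# Extracts tokens ending in "Exception" (e.g. java.io.EOFException),
# then counts and sorts them, most frequent first.
grep -o '[A-Za-z0-9.]*Exception' "${1:-datanode.log}" | sort | uniq -c | sort -rn
```

A log dominated by EOFExceptions on writeBlock, as here, points at peers or clients dropping connections mid-transfer (overload, GC pauses, or network trouble) rather than at local disk problems.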