Date: Fri, 8 May 2009 14:54:21 -0400
Subject: MapReduce error
From: Tom Nichols
To: hbase-user

(Apologies if this is more appropriate for the Hadoop user list, but to be fair, my input and output are HBase tables...)

I'm trying to run a MapReduce job on a 0.19.2 HBase cluster and I'm getting the following error:

org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server 172.16.10.140:60020 for region Platts_Megawatt,,1241729216060, row '', but failed after 10 attempts.
Exceptions:
java.lang.NullPointerException
java.lang.NullPointerException
java.lang.NullPointerException
java.lang.NullPointerException
java.lang.NullPointerException
java.lang.NullPointerException
java.lang.NullPointerException
java.lang.NullPointerException
java.lang.NullPointerException
java.lang.NullPointerException

        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:858)
        at org.apache.hadoop.hbase.client.HTable$ClientScanner.nextScanner(HTable.java:1594)
        ...

RegionServer log:
-----------------------------------------
2009-05-08 14:05:36,464 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: org.apache.hadoop.hbase.UnknownScannerException: Name: -1
2009-05-08 14:05:36,466 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60020, call next(-1, 30) from 172.16.10.95:60969: error: org.apache.hadoop.hbase.UnknownScannerException: Name: -1
org.apache.hadoop.hbase.UnknownScannerException: Name: -1
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1574)
        at sun.reflect.GeneratedMethodAccessor26.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:912)
-----------------------------------------

It happens that for this particular table, the datanode is on the same machine as the hbase master/namenode server. Now, I don't see any datanode errors for that exact same timeframe, but there are a couple other errors around that time:

DataNode.log:
-----------------------------------------------
2009-05-08 14:30:52,605 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /172.16.10.140:50010, dest: /172.16.10.140:60214, bytes: 1915392, op: HDFS_READ, cliID: DFSClient_-221953257, srvID: DS-526908905-172.16.10.140-50010-1229537939612, blockid: blk_-8408339440397934538_221228
2009-05-08 14:30:52,605 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(172.16.10.140:50010, storageID=DS-526908905-172.16.10.140-50010-1229537939612, infoPort=50075, ipcPort=50020):Got exception while serving blk_-8408339440397934538_221228 to /172.16.10.140:
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/172.16.10.140:50010 remote=/172.16.10.140:60214]
        at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:185)
        at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
        at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:293)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:387)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:179)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:94)
        at java.lang.Thread.run(Thread.java:619)

2009-05-08 14:30:52,605 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(172.16.10.140:50010, storageID=DS-526908905-172.16.10.140-50010-1229537939612, infoPort=50075, ipcPort=50020):DataXceiver
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/172.16.10.140:50010 remote=/172.16.10.140:60214]
        at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:185)
        at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
        at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:293)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:387)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:179)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:94)
        at java.lang.Thread.run(Thread.java:619)
-------------------------------------------------

Any ideas? Thanks in advance.

-Tom
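
A quick check of the timing in the two logs, as a rough Python sketch. The timestamps and the 480000 ms figure are copied from the log lines in this message. The helper name `parse_log_time` is made up for illustration, and subtracting the timeout to estimate when the DataNode write began to stall is only an assumption about how that timeout is measured.

```python
from datetime import datetime, timedelta

# Timestamp layout used by the log lines quoted in this message.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_log_time(stamp: str) -> datetime:
    """Parse a log timestamp such as '2009-05-08 14:05:36,464' (hypothetical helper)."""
    return datetime.strptime(stamp, LOG_TIME_FORMAT)


# First UnknownScannerException in the RegionServer log.
regionserver_error = parse_log_time("2009-05-08 14:05:36,464")
# SocketTimeoutException entries in the DataNode log.
datanode_timeout = parse_log_time("2009-05-08 14:30:52,605")
# "480000 millis timeout" from the DataNode exception message.
write_timeout = timedelta(milliseconds=480000)

# Distance between the two errors.
gap = datanode_timeout - regionserver_error
# Assumption: the write stalled roughly one full timeout before the log entry.
stall_start = datanode_timeout - write_timeout

print("gap between errors:", gap)
print("estimated stall start:", stall_start)
print("stall began after scanner error:", stall_start > regionserver_error)
```

Under that assumption, the DataNode write stall would have started well after the RegionServer scanner error, which fits the observation that there are no DataNode errors in the same timeframe.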