From: James Kennedy
Subject: Re: How to handle data migration?
Date: Fri, 14 Jan 2011 10:52:26 -0800
To: dev@hbase.apache.org

Negative. I deleted the zookeeper dir and HMaster still managed to pull the wrong IP address from somewhere.

I don't have a lot of time to really investigate this myself but I'll try to reproduce it with a basic test and log a case for it.

By the way, can someone clarify the side-effects of deleting the zookeeper dir like that? I assume it has no ill effect on the data itself especially when the cluster is down. But what is the worst that can happen if you delete the dir while the cluster is running?

Thanks
James

On 2011-01-14, at 9:54 AM, Stack wrote:

> It does seem like a regression. If u kill the zk data dir and restart the cluster does it work? (root location is up in zk)
>
> Stack
>
> On Jan 13, 2011, at 11:37, James Kennedy wrote:
>
>> I'm currently validating the new 0.90.0 RC3 with the hbase-trx layer and our own application.
>>
>> All seems well so far except for the fact that I now find that HBase doesn't adapt if I try to run the same data on different machines.
>>
>> e.g.
>> 1) I work from home and generated our seeded test data.
>> 2) Run the test suite and all tests pass
>> 3) I go to the office and re-run the tests.
>>
>> Result: HMaster fails because the .ROOT data has the wrong IP address for locating the .META. At least that is my understanding from the stacktrace below. Note that the 192.168.1.102 IP address in that trace is the IP from my home network and is incorrect.
>>
>> This wasn't an issue with previous versions of HBase as far as I've noticed. And this seems to be a big data portability fail.
>> Surely the HMaster should be able to absorb stale metadata and wait for new region-servers to check in.
>> Instead it just keels over and dies.
>> But before logging a case I wanted to know if there was something I'm obviously missing or doing wrong.
>>
>> The seeded test data is on HDFS.
>>
>> Thoughts?
>>
>>
>> [13/01/11 10:58:42] 5939 [ main] INFO ion.service.HBaseRegionService - troove> Starting region server thread.
>> [13/01/11 11:00:15] 98699 [ HMaster] FATAL he.hadoop.hbase.master.HMaster - Unhandled exception. Starting shutdown.
>> java.net.SocketTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=192.168.1.102/192.168.1.102:60020]
>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:213)
>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
>>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:311)
>>     at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:865)
>>     at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:732)
>>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:258)
>>     at $Proxy15.getProtocolVersion(Unknown Source)
>>     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
>>     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
>>     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
>>     at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
>>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:954)
>>     at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:384)
>>     at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:283)
>>     at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:478)
>>     at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:435)
>>     at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:382)
>>     at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:277)
>>     at java.lang.Thread.run(Thread.java:680)
>>
>>
>> James Kennedy
>> Troove Inc.