hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "James Kennedy (JIRA)" <j...@apache.org>
Subject [jira] Created: (HBASE-3445) Master crashes on when data moved to different host
Date Fri, 14 Jan 2011 19:14:46 GMT
Master crashes on when data moved to different host

                 Key: HBASE-3445
                 URL: https://issues.apache.org/jira/browse/HBASE-3445
             Project: HBase
          Issue Type: Bug
          Components: master
    Affects Versions: 0.90.0
            Reporter: James Kennedy
            Priority: Critical
             Fix For: 0.90.0

While testing an upgrade to 0.90.0 RC3 I noticed that if I seeded our test data on one machine
and transferred to another machine the HMaster on the new machine dies on startup.

Based on the following stack trace it looks as though it is attempting to find the .meta region
with the ip address of the original machine.  Instead of waiting around for RegionServer's
to register with new location data, HMaster throws it's hands up with a FATAL exception.

Note that deleting the zookeeper dir makes no difference.

Also note that so far I have only reproduced this in my own environment using the hbase-trx
extension of HBase and an ApplicationStarter that starts the Master and RegionServer together
in the same JVM.  While the issue seems likely isolated from those factors it is far from
a vanilla HBase environment.

I will spend some time trying to reproduce the issue in a proper hbase test.  But perhaps
someone can beat me to it?  How do I simulate the IP switch? May require a data.tar upload.

[14/01/11 10:45:20] 6396   [     Thread-298] ERROR server.quorum.QuorumPeerConfig  - Invalid
configuration, only one server specified (ignoring)
[14/01/11 10:45:21] 7178   [           main] INFO  ion.service.HBaseRegionService  - troove>
region port:       60010
[14/01/11 10:45:21] 7180   [           main] INFO  ion.service.HBaseRegionService  - troove>
region interface:  org.apache.hadoop.hbase.ipc.IndexedRegionInterface
[14/01/11 10:45:21] 7180   [           main] INFO  ion.service.HBaseRegionService  - troove>
root dir: hdfs://localhost:8701/hbase
[14/01/11 10:45:21] 7180   [           main] INFO  ion.service.HBaseRegionService  - troove>
Initializing region server.
[14/01/11 10:45:21] 7631   [           main] INFO  ion.service.HBaseRegionService  - troove>
Starting region server thread.
[14/01/11 10:46:54] 100764 [        HMaster] FATAL he.hadoop.hbase.master.HMaster  - Unhandled
exception. Starting shutdown.
java.net.SocketTimeoutException: 20000 millis timeout while waiting for channel to be ready
for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=]
	at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:213)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
	at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:311)
	at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:865)
	at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:732)
	at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:258)
	at $Proxy14.getProtocolVersion(Unknown Source)
	at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
	at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
	at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
	at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:954)
	at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:384)
	at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:283)
	at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:478)
	at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:435)
	at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:382)
	at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:277)

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message