hbase-user mailing list archives

From Fabien Chung <chung.fab...@gmail.com>
Subject Failed deleting my ephemeral node
Date Tue, 07 May 2013 13:05:29 GMT
Hi all,

I have a cluster of 8 machines (CDH4). I use an ETL tool (Talend) to insert
data into HBase. Most of the time that works perfectly, but sometimes rows are
not inserted, and I have no clue about the reason for the failure. Talend
reports 0 errors. This usually happens when I delete the table in
HBase and recreate a new one from Talend.

I think these logs are relevant:
2013-04-16 14:31:09,610 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 5 on 60020: exiting
2013-04-16 14:31:09,610 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server Responder
2013-04-16 14:31:09,610 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 4 on 60020: exiting
2013-04-16 14:31:09,610 INFO org.apache.hadoop.ipc.HBaseServer: REPL IPC Server handler 0 on 60020: exiting
2013-04-16 14:31:09,610 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 6 on 60020: exiting
2013-04-16 14:31:09,609 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 7 on 60020: exiting
2013-04-16 14:31:09,609 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 3 on 60020: exiting
2013-04-16 14:31:09,609 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 8 on 60020: exiting
2013-04-16 14:31:09,609 INFO org.apache.hadoop.hbase.regionserver.SplitLogWorker: Sending interrupt to stop the worker thread
2013-04-16 14:31:09,610 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
2013-04-16 14:31:09,610 INFO org.apache.hadoop.hbase.regionserver.SplitLogWorker: SplitLogWorker interrupted while waiting for task, exiting: java.lang.InterruptedException
2013-04-16 14:31:09,610 INFO org.apache.hadoop.hbase.regionserver.SplitLogWorker: SplitLogWorker NODE11.ysance.local,60020,1366110719610 exiting
2013-04-16 14:31:09,611 INFO org.mortbay.log: Stopped SelectChannelConnector@0.0.0.0:60030
2013-04-16 14:31:09,712 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: regionserver60020.cacheFlusher exiting
2013-04-16 14:31:09,712 INFO org.apache.hadoop.hbase.regionserver.LogRoller: LogRoller exiting.
2013-04-16 14:31:09,712 INFO org.apache.hadoop.hbase.regionserver.HRegionServer$CompactionChecker: regionserver60020.compactionChecker exiting
2013-04-16 14:31:09,712 INFO org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: Stopping RegionServerSnapshotManager gracefully.
2013-04-16 14:31:09,727 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2013-04-16 14:31:09,727 INFO org.apache.zookeeper.ZooKeeper: Session: 0x13e128af3010001 closed
2013-04-16 14:31:09,727 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server NODE11.ysance.local,60020,1366110719610
2013-04-16 14:31:09,728 INFO org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: Stopping RegionServerSnapshotManager gracefully.
2013-04-16 14:31:09,728 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server NODE11.ysance.local,60020,1366110719610; all regions closed.
2013-04-16 14:31:09,728 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: regionserver60020.logSyncer exiting
2013-04-16 14:31:10,161 INFO org.apache.hadoop.hbase.regionserver.Leases: regionserver60020 closing leases
2013-04-16 14:31:10,161 INFO org.apache.hadoop.hbase.regionserver.Leases: regionserver60020 closed leases
2013-04-16 14:31:10,163 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/NODE11.ysance.local,60020,1366110719610
2013-04-16 14:31:10,163 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 2000ms before retry #1...
2013-04-16 14:31:12,163 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/NODE11.ysance.local,60020,1366110719610
2013-04-16 14:31:12,163 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 4000ms before retry #2...
2013-04-16 14:31:16,163 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/NODE11.ysance.local,60020,1366110719610
2013-04-16 14:31:16,163 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 8000ms before retry #3...
2013-04-16 14:31:19,389 INFO org.apache.hadoop.hbase.regionserver.Leases: regionserver60020.leaseChecker closing leases
2013-04-16 14:31:19,390 INFO org.apache.hadoop.hbase.regionserver.Leases: regionserver60020.leaseChecker closed leases
2013-04-16 14:31:24,163 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/NODE11.ysance.local,60020,1366110719610
2013-04-16 14:31:24,163 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper delete failed after 3 retries
2013-04-16 14:31:24,164 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Failed deleting my ephemeral node
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/NODE11.ysance.local,60020,1366110719610
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
        at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.delete(RecoverableZooKeeper.java:137)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1215)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1204)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.deleteMyEphemeralNode(HRegionServer.java:1068)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:846)
        at java.lang.Thread.run(Thread.java:662)
2013-04-16 14:31:24,165 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server NODE11.ysance.local,60020,1366110719610; zookeeper connection closed.
2013-04-16 14:31:24,165 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020 exiting
2013-04-16 14:31:24,165 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Starting fs shutdown hook thread.
2013-04-16 14:31:24,166 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook finished.
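
As far as I can read it, the log shows the delete of the ephemeral node being retried with sleeps of 2000 ms, 4000 ms and 8000 ms, then abandoned after 3 retries. To make that pattern explicit, here is a minimal sketch of a doubling backoff. It is only an illustration of what the log suggests, not the actual HBase RetryCounter or RecoverableZooKeeper code; the names delete_with_retries and SessionExpired are made up for this example.

```python
import time


class SessionExpired(Exception):
    """Hypothetical stand-in for ZooKeeper's SessionExpiredException."""


def delete_with_retries(delete, max_retries=3, base_sleep_ms=2000, sleep=time.sleep):
    """Call delete(); on SessionExpired, sleep and retry with a doubling delay.

    With the defaults the delays are 2000 ms, 4000 ms and 8000 ms, matching
    the "Sleeping ...ms before retry #N" lines in the log. After max_retries
    failed retries the last exception is re-raised.
    """
    for attempt in range(max_retries + 1):
        try:
            return delete()
        except SessionExpired:
            if attempt == max_retries:
                # Corresponds to "ZooKeeper delete failed after 3 retries".
                raise
            delay_ms = base_sleep_ms * 2 ** attempt
            sleep(delay_ms / 1000.0)
```

In this sketch the sleep function is a parameter only so that the delays can be observed without actually waiting.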

In my mind, the issue comes from ZooKeeper or the region server, but I can't
really identify where exactly the problem is.

Do you have any idea?

Regards

-- 
Chung Fabien
