hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Failed deleting my ephemeral node
Date Tue, 07 May 2013 16:29:30 GMT
Can you tell us a bit more about your ZooKeeper setup?

Checking the ZooKeeper log around 2013-04-16 14:31:24 would help, too.
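
One way to pull the entries around that time is a small filter like the sketch below. It assumes the ZooKeeper log uses the same leading timestamp format as the region server log quoted in this thread (yyyy-MM-dd HH:mm:ss,SSS); the function name lines_near and the file name zookeeper.log are placeholders, not anything from your setup.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout, e.g. "2013-04-16 14:31:24,163" (23 characters).
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def lines_near(lines, center, window_seconds=60):
    """Return log lines stamped within window_seconds of center.

    Lines without a leading timestamp (stack trace continuations)
    follow the decision made for the last timestamped line.
    """
    center_ts = datetime.strptime(center, "%Y-%m-%d %H:%M:%S")
    window = timedelta(seconds=window_seconds)
    keep = False
    selected = []
    for line in lines:
        try:
            ts = datetime.strptime(line[:23], TS_FORMAT)
            keep = abs(ts - center_ts) <= window
        except ValueError:
            pass  # continuation line: reuse the previous decision
        if keep:
            selected.append(line.rstrip("\n"))
    return selected


if __name__ == "__main__":
    # "zookeeper.log" is a hypothetical path; substitute the real one.
    with open("zookeeper.log") as handle:
        for entry in lines_near(handle, "2013-04-16 14:31:24"):
            print(entry)
```

Session expiration or connection messages in that window are what I would look at first.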

Cheers

On Tue, May 7, 2013 at 6:05 AM, Fabien Chung <chung.fabien@gmail.com> wrote:

> Hi all,
>
> I have a cluster of 8 machines (CDH4). I use an ETL tool (Talend) to insert
> data into HBase. Most of the time that works perfectly, but sometimes rows are
> not inserted, and I have no clue about the reason for the failure. Talend
> reports 0 errors. This usually happens when I delete the table in
> HBase and recreate it from Talend.
>
> I think these logs are relevant:
>
> 2013-04-16 14:31:09,610 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 5 on 60020: exiting
> 2013-04-16 14:31:09,610 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server Responder
> 2013-04-16 14:31:09,610 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 4 on 60020: exiting
> 2013-04-16 14:31:09,610 INFO org.apache.hadoop.ipc.HBaseServer: REPL IPC Server handler 0 on 60020: exiting
> 2013-04-16 14:31:09,610 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 6 on 60020: exiting
> 2013-04-16 14:31:09,609 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 7 on 60020: exiting
> 2013-04-16 14:31:09,609 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 3 on 60020: exiting
> 2013-04-16 14:31:09,609 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 8 on 60020: exiting
> 2013-04-16 14:31:09,609 INFO org.apache.hadoop.hbase.regionserver.SplitLogWorker: Sending interrupt to stop the worker thread
> 2013-04-16 14:31:09,610 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
> 2013-04-16 14:31:09,610 INFO org.apache.hadoop.hbase.regionserver.SplitLogWorker: SplitLogWorker interrupted while waiting for task, exiting: java.lang.InterruptedException
> 2013-04-16 14:31:09,610 INFO org.apache.hadoop.hbase.regionserver.SplitLogWorker: SplitLogWorker NODE11.ysance.local,60020,1366110719610 exiting
> 2013-04-16 14:31:09,611 INFO org.mortbay.log: Stopped SelectChannelConnector@0.0.0.0:60030
> 2013-04-16 14:31:09,712 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: regionserver60020.cacheFlusher exiting
> 2013-04-16 14:31:09,712 INFO org.apache.hadoop.hbase.regionserver.LogRoller: LogRoller exiting.
> 2013-04-16 14:31:09,712 INFO org.apache.hadoop.hbase.regionserver.HRegionServer$CompactionChecker: regionserver60020.compactionChecker exiting
> 2013-04-16 14:31:09,712 INFO org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: Stopping RegionServerSnapshotManager gracefully.
> 2013-04-16 14:31:09,727 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
> 2013-04-16 14:31:09,727 INFO org.apache.zookeeper.ZooKeeper: Session: 0x13e128af3010001 closed
> 2013-04-16 14:31:09,727 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server NODE11.ysance.local,60020,1366110719610
> 2013-04-16 14:31:09,728 INFO org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: Stopping RegionServerSnapshotManager gracefully.
> 2013-04-16 14:31:09,728 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server NODE11.ysance.local,60020,1366110719610; all regions closed.
> 2013-04-16 14:31:09,728 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: regionserver60020.logSyncer exiting
> 2013-04-16 14:31:10,161 INFO org.apache.hadoop.hbase.regionserver.Leases: regionserver60020 closing leases
> 2013-04-16 14:31:10,161 INFO org.apache.hadoop.hbase.regionserver.Leases: regionserver60020 closed leases
> 2013-04-16 14:31:10,163 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/NODE11.ysance.local,60020,1366110719610
> 2013-04-16 14:31:10,163 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 2000ms before retry #1...
> 2013-04-16 14:31:12,163 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/NODE11.ysance.local,60020,1366110719610
> 2013-04-16 14:31:12,163 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 4000ms before retry #2...
> 2013-04-16 14:31:16,163 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/NODE11.ysance.local,60020,1366110719610
> 2013-04-16 14:31:16,163 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 8000ms before retry #3...
> 2013-04-16 14:31:19,389 INFO org.apache.hadoop.hbase.regionserver.Leases: regionserver60020.leaseChecker closing leases
> 2013-04-16 14:31:19,390 INFO org.apache.hadoop.hbase.regionserver.Leases: regionserver60020.leaseChecker closed leases
> 2013-04-16 14:31:24,163 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/NODE11.ysance.local,60020,1366110719610
> 2013-04-16 14:31:24,163 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper delete failed after 3 retries
> 2013-04-16 14:31:24,164 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Failed deleting my ephemeral node
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/NODE11.ysance.local,60020,1366110719610
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>         at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
>         at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.delete(RecoverableZooKeeper.java:137)
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1215)
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1204)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.deleteMyEphemeralNode(HRegionServer.java:1068)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:846)
>         at java.lang.Thread.run(Thread.java:662)
> 2013-04-16 14:31:24,165 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server NODE11.ysance.local,60020,1366110719610; zookeeper connection closed.
> 2013-04-16 14:31:24,165 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020 exiting
> 2013-04-16 14:31:24,165 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Starting fs shutdown hook thread.
> 2013-04-16 14:31:24,166 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook finished.
>
> I think the issue comes from ZooKeeper or the region server, but I can't
> identify where exactly the problem is.
>
> Do you have any idea?
>
> Regards
>
> --
> Chung Fabien
>
