hbase-user mailing list archives

From Neutron sharc <neutronsh...@gmail.com>
Subject Re: hbase 0.94.7 snapshot problem
Date Wed, 17 Jun 2015 17:48:17 GMT
An update in case somebody else also stumbles on this issue. The problem is
fixed by applying the patch from HBASE-8413:
https://issues.apache.org/jira/browse/HBASE-8413



On Sun, May 17, 2015 at 12:53 PM, lars hofhansl <larsh@apache.org> wrote:

> The latest version of 0.94 is 0.94.27. I doubt you'll get much help for
> 0.94.7 here (it's two years and 20! releases ago).
> Note that you can upgrade from 0.94.7 to 0.94.27 without downtime (with a
> rolling upgrade), but you'll have to build it from source yourself.
>
> -- Lars
> From: Neutron sharc <neutronsharc@gmail.com>
> To: user@hbase.apache.org
> Sent: Friday, May 15, 2015 3:40 PM
> Subject: hbase 0.94.7 snapshot problem
>
> Hi HBase community,
>
> I'm seeing a problem with HBase snapshots on 0.94.7 (CDH 4.2.0).
>
> When I manually run "snapshot <table name>, <snapshot name>" to take a
> snapshot, I keep getting an error like "Failed taking snapshot {
> ss=ss_rich_pin_data_v1 table=rich_pin_data_v1 type=SKIPFLUSH } due to
> exception:No region directory found for region {xyz...}".
>
> I tried moving the problem region around, but then another region hits the
> same issue the next time.
>
> I tried a workaround suggested elsewhere (setting
> hbase.regionserver.ipc.address to 0.0.0.0), but that doesn't work. Here is
> the link:
> https://groups.google.com/a/cloudera.org/forum/#!topic/scm-users/B3fSsY6BgWI
>
>
> Below is an excerpt from master log:
>
> 2015-05-15 22:17:18,807 INFO
> org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler: Running
> SKIPFLUSH table snapshot ss_rich_pin_data_v1 C_M_SNAPSHOT_TABLE on table
> rich_pin_data_v1
> 2015-05-15 22:17:19,308 INFO org.apache.hadoop.hbase.procedure.Procedure:
> Starting procedure 'ss_rich_pin_data_v1'
> 2015-05-15 22:17:54,346 ERROR org.apache.hadoop.hbase.procedure.Procedure:
> Procedure 'ss_rich_pin_data_v1' execution failed!
> org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable via
> timer-java.util.Timer@14004920
> :org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable:
> org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed!
> Source:Timeout caused Foreign Exception Start:1431728239316,
> End:1431728274317, diff:35001, max:35000 ms
> at org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:85)
> at org.apache.hadoop.hbase.procedure.Procedure.waitForLatch(Procedure.java:369)
> at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:208)
> at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:68)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by:
> org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable:
> org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed!
> Source:Timeout caused Foreign Exception Start:1431728239316,
> End:1431728274317, diff:35001, max:35000 ms
> at org.apache.hadoop.hbase.errorhandling.TimeoutExceptionInjector$1.run(TimeoutExceptionInjector.java:71)
> at java.util.TimerThread.mainLoop(Timer.java:512)
> at java.util.TimerThread.run(Timer.java:462)
> 2015-05-15 22:17:54,347 INFO
> org.apache.hadoop.hbase.procedure.ZKProcedureUtil: Clearing all znodes for
> procedure ss_rich_pin_data_v1including nodes
> /hbase/online-snapshot/acquired /hbase/online-snapshot/reached
> /hbase/online-snapshot/abort
> 2015-05-15 22:17:54,383 INFO
> org.apache.hadoop.hbase.master.snapshot.EnabledTableSnapshotHandler: Done
> waiting - snapshot for ss_rich_pin_data_v1 finished!
> 2015-05-15 22:17:54,841 ERROR
> org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler: Failed taking
> snapshot { ss=ss_rich_pin_data_v1 table=rich_pin_data_v1 type=SKIPFLUSH }
> due to exception:No region directory found for region:{NAME =>
> 'rich_pin_data_v1,,1389326617112.081c4e6d88c46ff9be61b231b8ed2aca.',
> STARTKEY => '', ENDKEY => '0030a5c15b50587297a8fa0bd585a12b', ENCODED =>
> 081c4e6d88c46ff9be61b231b8ed2aca,}
> org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: No region
> directory found for region:{NAME =>
> 'rich_pin_data_v1,,1389326617112.081c4e6d88c46ff9be61b231b8ed2aca.',
> STARTKEY => '', ENDKEY => '0030a5c15b50587297a8fa0bd585a12b', ENCODED =>
> 081c4e6d88c46ff9be61b231b8ed2aca,}
> at org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifyRegion(MasterSnapshotVerifier.java:167)
> at org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifyRegions(MasterSnapshotVerifier.java:152)
> at org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifySnapshot(MasterSnapshotVerifier.java:115)
> at org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.process(TakeSnapshotHandler.java:156)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2015-05-15 22:17:54,841 INFO
> org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler: Stop taking
> snapshot={ ss=ss_rich_pin_data_v1 table=rich_pin_data_v1 type=SKIPFLUSH }
> because: Failed to take snapshot '{ ss=ss_rich_pin_data_v1
> table=rich_pin_data_v1 type=SKIPFLUSH }' due to exception
>
> Appreciate any help!
>
> -Neutronsharc
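
For anyone reading the quoted master log: the first ERROR is the snapshot procedure hitting its 35000 ms limit, logged about half a second before the "No region directory found" failure. Below is a minimal Python sketch, not part of the original thread, that pulls the numbers out of that timeout line and checks the arithmetic. The helper name `parse_timeout` and the regular expression are assumptions based only on the single message quoted here, not on any documented HBase log format.

```python
import re

# Timeout message copied from the quoted master log (joined onto one line).
LOG_LINE = (
    "org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed! "
    "Source:Timeout caused Foreign Exception Start:1431728239316, "
    "End:1431728274317, diff:35001, max:35000 ms"
)

# Assumed shape of the message, inferred from the one example above.
PATTERN = re.compile(r"Start:(\d+), End:(\d+), diff:(\d+), max:(\d+) ms")


def parse_timeout(line):
    """Return the timing fields of a timeout message, or None if absent."""
    match = PATTERN.search(line)
    if match is None:
        return None
    start_ms, end_ms, diff_ms, max_ms = (int(g) for g in match.groups())
    return {
        "start_ms": start_ms,
        "end_ms": end_ms,
        "diff_ms": diff_ms,
        "max_ms": max_ms,
        # True when the measured duration is longer than the allowed limit.
        "exceeded": end_ms - start_ms > max_ms,
    }


info = parse_timeout(LOG_LINE)
print(info)
```

Here the reported `diff` equals `End - Start` (35001 ms), which is 1 ms over the 35000 ms limit, so the procedure was aborted by the timer rather than by a region server reporting an error.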
