hbase-user mailing list archives

From Brian Jeltema <brian.jelt...@digitalenvoy.net>
Subject Re: snapshot timeouts
Date Wed, 08 Oct 2014 19:25:35 GMT
Sorry, I usually include that info. The HBase version is 0.98, and hbase.rpc.timeout is left at its default.

When the ‘ERROR: Call id….’ occurred, there was no stack trace. That was the entire
error output.

Before I increased the snapshot timeout parameters, the timeout I was seeing looked like:

ERROR: org.apache.hadoop.hbase.snapshot.HBaseSnapshotException: Snapshot { ss=Host-bdj table=Host
type=FLUSH } had an error.  Procedure Host-bdj { waiting=[] done=[host-22.hdfs.foo.net,60020,1410543068459,
host-24.hdfs.foo.net,60020,1412603246174, host-17.hdfs.foo.net,60020,1410543059186, host-19.hdfs.foo.net,60020,1412419924491,
host-20.hdfs.foo.net,60020,1412419942143, host-16.hdfs.foo.net,60020,1403178964733, host-15.hdfs.foo.net,60020,1403178962029,
host-21.hdfs.foo.net,60020,1403178959748, host-23.hdfs.foo.net,60020,1410543079248, host-18.hdfs.foo.net,60020,1410543061865]
}
	at org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:366)
	at org.apache.hadoop.hbase.master.HMaster.isSnapshotDone(HMaster.java:2993)
	at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:38245)
	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2008)
	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:92)
	at org.apache.hadoop.hbase.ipc.FifoRpcScheduler$1.run(FifoRpcScheduler.java:73)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.hadoop.hbase.errorhandling.TimeoutException via timer-java.util.Timer@3097c4e1:org.apache.hadoop.hbase.errorhandling.TimeoutException:
Timeout elapsed! Source:Timeout caused Foreign Exception Start:1412792382137, End:1412792442137,
diff:60000, max:60000 ms
	at org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:83)
	at org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.rethrowExceptionIfFailed(TakeSnapshotHandler.java:318)
	at org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:356)
	... 10 more
Caused by: org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed! Source:Timeout
caused Foreign Exception Start:1412792382137, End:1412792442137, diff:60000, max:60000 ms
	at org.apache.hadoop.hbase.errorhandling.TimeoutExceptionInjector$1.run(TimeoutExceptionInjector.java:67)
	at java.util.TimerThread.mainLoop(Timer.java:555)
	at java.util.TimerThread.run(Timer.java:505)
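
As a sanity check on the trace above, here is a minimal Python sketch (not part of the original report) that pulls the Start, End, diff and max fields out of the TimeoutException message and confirms the elapsed time equals the 60000 ms limit. The helper name parse_timeout is made up for illustration, and the message text is simply copied from the trace.

```python
import re
from datetime import datetime, timezone

# Message text copied from the TimeoutException in the trace above.
MESSAGE = (
    "Timeout elapsed! Source:Timeout caused Foreign Exception "
    "Start:1412792382137, End:1412792442137, diff:60000, max:60000 ms"
)


def parse_timeout(message):
    """Return the Start/End/diff/max fields (milliseconds) as integers.

    Hypothetical helper for illustration; it only understands the
    field layout seen in the message above.
    """
    fields = dict(re.findall(r"(Start|End|diff|max):(\d+)", message))
    return {name: int(value) for name, value in fields.items()}


info = parse_timeout(MESSAGE)

# Elapsed time between the two epoch-millisecond timestamps.
elapsed_ms = info["End"] - info["Start"]

# Convert the start timestamp to a readable UTC time for log correlation.
started = datetime.fromtimestamp(info["Start"] / 1000, tz=timezone.utc)

print("started (UTC):", started.isoformat())
print("elapsed ms:", elapsed_ms)
print("reached limit:", elapsed_ms >= info["max"])
```

If the elapsed time matches max, as it does here, the procedure ran into the configured limit rather than failing for some other reason, and the UTC start time can be used to find the matching entries in the master and region server logs.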

On Oct 8, 2014, at 3:18 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> Can you give a bit more information:
> 
> the release of hbase you're using
> value for hbase.rpc.timeout (looks like you left it at the default)
> more of the error (please include stack trace if possible)
> 
> Cheers
> 
> On Wed, Oct 8, 2014 at 12:09 PM, Brian Jeltema <brian.jeltema@foo.net> wrote:
> 
>> I’m trying to snapshot a moderately large table (3 billion rows, but not a
>> huge amount of data per row).
>> Those snapshots have been timing out, so I set the following parameters to
>> relatively large values:
>> 
>>     hbase.snapshot.master.timeoutMillis
>>     hbase.snapshot.region.timeout
>>     hbase.snapshot.master.timeout.millis
>> 
>> A snapshot attempt then resulted in the terse result:
>> 
>>     ERROR: Call id=13, waitTime=60060, rpcTimeout=60000
>> 
>> A brief review of some of the hbase log files didn’t reveal anything (but
>> there are many).
>> How should I pursue getting these snapshots to work?
>> 
>> Brian

