hbase-user mailing list archives

From Brian Jeltema <brian.jelt...@digitalenvoy.net>
Subject Re: snapshot timeout problem
Date Mon, 21 Jul 2014 13:56:21 GMT
There are 174 regions, and they are not well balanced: one RegionServer has 69 of them.
That RegionServer generates a series of log entries (modified and shown below), one for
each region, at roughly 1- to 2-second intervals. The timeout period expires when it
reaches region 36.

2014-07-21 07:49:44,503 regionserver.HRegion: Creating references for hfiles
2014-07-21 07:49:44,503 regionserver.HRegion: Adding snapshot references for [hdfs://xxx.digitalenvoy.net:8020/apps/hbase/data/data/default/hosts/31e2a098e9e311c4ddcfd3d8da28dfb6/p/3749b6df36c749508fe9c6f54ca425f2] hfiles
2014-07-21 07:49:44,503 regionserver.HRegion: Creating reference for file (1/1) : hdfs://xxx.digitalenvoy.net:8020/apps/hbase/data/data/default/hosts/31e2a098e9e311c4ddcfd3d8da28dfb6/p/3749b6df36c749508fe9c6f54ca425f2
2014-07-21 07:49:45,136 snapshot.FlushSnapshotSubprocedure: ... Flush Snapshotting region hosts,\x00\x03|\xBF!,1400600029600.31e2a098e9e311c4ddcfd3d8da28dfb6. completed.
2014-07-21 07:49:45,137 snapshot.FlushSnapshotSubprocedure: Closing region operation on hosts,\x00\x03|\xBF!,1400600029600.31e2a098e9e311c4ddcfd3d8da28dfb6.
2014-07-21 07:49:45,137 DEBUG [rs(xxx.digitalenvoy.net,60020,1405943192177)-snapshot-pool3-thread-1] snapshot.FlushSnapshotSubprocedure: Starting region operation on hosts,\x00\x8A\x90\xD6\x08,1400659179080.a74402fcbd9a96a7c92b250721095729.
2014-07-21 07:49:45,137 DEBUG [member: 'xxx.digitalenvoy.net,60020,1405943192177' subprocedure-pool1-thread-2] snapshot.RegionServerSnapshotManager: Completed 1/174 local region snapshots.
2014-07-21 07:49:45,137 snapshot.FlushSnapshotSubprocedure: Flush Snapshotting region hosts,\x00\x8A\x90\xD6\x08,1400659179080.a74402fcbd9a96a7c92b250721095729. started...
2014-07-21 07:49:45,137 regionserver.HRegion: Storing region-info for snapshot.
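
As a rough check on the numbers above, here is a minimal sketch that extrapolates how long this RegionServer would need to get through all of its regions. It assumes the regions are snapshotted one at a time at a constant rate, and that 36 regions were done when the 60-second timeout fired; both are readings of the description above, not something the logs confirm. The function and variable names are purely illustrative.

```python
# Back-of-the-envelope estimate using the figures reported in this message.
TIMEOUT_S = 60                # observed snapshot timeout, in seconds
REGIONS_ON_SERVER = 69        # regions hosted by the most loaded RegionServer
REGIONS_DONE_AT_TIMEOUT = 36  # assumption: regions finished when the timeout fired


def estimate_required_timeout(timeout_s, regions_done, regions_total):
    """Extrapolate linearly: average seconds per region times total regions."""
    per_region_s = timeout_s / regions_done
    return per_region_s, per_region_s * regions_total


per_region, required = estimate_required_timeout(
    TIMEOUT_S, REGIONS_DONE_AT_TIMEOUT, REGIONS_ON_SERVER
)
print(f"~{per_region:.2f} s per region, ~{required:.0f} s for all {REGIONS_ON_SERVER} regions")
```

Under those assumptions the average works out to a bit under 2 seconds per region, which matches the 1- to 2-second spacing in the log, and the whole server would need roughly twice the current 60-second limit.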

On Jul 21, 2014, at 9:21 AM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:

> Can you also tell us more about your table? How many regions on how many
> region servers?
> 
> 
> 2014-07-21 8:23 GMT-04:00 Ted Yu <yuzhihong@gmail.com>:
> 
>> Normally such timeout is caused by one region server which is slow in
>> completing its part of the snapshot procedure.
>> 
>> Have you looked at the region server logs?
>> Feel free to pastebin relevant portion.
>> 
>> Thanks
>> 
>> On Jul 21, 2014, at 4:03 AM, Brian Jeltema <brian.jeltema@digitalenvoy.net>
>> wrote:
>> 
>>> I’m running HBase 0.98. I’m trying to snapshot a table, but it’s timing out after 60 seconds.
>>> I increased the value of hbase.snapshot.master.timeoutMillis and restarted HBase,
>>> but the timeout still happens after 60 seconds. Any suggestions?
>>> 
>>> Brian
>> 

