hbase-user mailing list archives

From Matteo Bertozzi <theo.berto...@gmail.com>
Subject Re: snapshot timeout problem
Date Mon, 21 Jul 2014 14:02:31 GMT
There are two timeout properties: one on the region server side and the
other on the master side (the coordinator).

"hbase.snapshot.master.timeoutMillis"
"hbase.snapshot.region.timeout"

Increasing only the master-side timeout has no effect, since the region
server will still report a timeout to the master after its default 60 seconds.
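
As a rough sketch, raising both timeouts in hbase-site.xml might look like
the fragment below. The 300000 ms value is only an illustrative assumption,
picked because the quoted log suggests roughly 1 to 2 seconds per region
across 69 regions on one server; choose a value that fits your own cluster.
The file needs to be in place on the master and on every region server, and
HBase restarted, before the change is picked up.

```xml
<!-- hbase-site.xml: illustrative values, not recommended defaults -->
<configuration>
  <!-- Master (coordinator) side timeout, in milliseconds -->
  <property>
    <name>hbase.snapshot.master.timeoutMillis</name>
    <value>300000</value>
  </property>
  <!-- Region server side timeout, in milliseconds (default is 60 seconds) -->
  <property>
    <name>hbase.snapshot.region.timeout</name>
    <value>300000</value>
  </property>
</configuration>
```

Keeping the two values equal avoids the case described above, where the
shorter region server timeout fires first regardless of the master setting.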


Matteo



On Mon, Jul 21, 2014 at 2:56 PM, Brian Jeltema <
brian.jeltema@digitalenvoy.net> wrote:

> There are 174 regions, not well balanced. One RegionServer has 69 regions.
> That RegionServer generates a
> series of log entries (modified and shown below), one for each region, at
> roughly 1 to 2 second intervals. The timeout period expires when
> it reaches region 36.
>
> 2014-07-21 07:49:44,503 regionserver.HRegion: Creating references for
> hfiles
> 2014-07-21 07:49:44,503 regionserver.HRegion: Adding snapshot references
> for [hdfs://
> xxx.digitalenvoy.net:8020/apps/hbase/data/data/default/hosts/31e2a098e9e311c4ddcfd3d8da28dfb6/p/3749b6df36c749508fe9c6f54ca425f2]
> hfiles
> 2014-07-21 07:49:44,503 regionserver.HRegion: Creating reference for file
> (1/1) : hdfs://
> xxx.digitalenvoy.net:8020/apps/hbase/data/data/default/hosts/31e2a098e9e311c4ddcfd3d8da28dfb6/p/3749b6df36c749508fe9c6f54ca425f2
> 2014-07-21 07:49:45,136 snapshot.FlushSnapshotSubprocedure: ... Flush
> Snapshotting region
> hosts,\x00\x03|\xBF!,1400600029600.31e2a098e9e311c4ddcfd3d8da28dfb6.
> completed.
> 2014-07-21 07:49:45,137 snapshot.FlushSnapshotSubprocedure: Closing region
> operation on
> hosts,\x00\x03|\xBF!,1400600029600.31e2a098e9e311c4ddcfd3d8da28dfb6.
> 2014-07-21 07:49:45,137 DEBUG [rs(xxx.digitalenvoy.net,60020,1405943192177)-snapshot-pool3-thread-1]
> snapshot.FlushSnapshotSubprocedure: Starting region operation on
> hosts,\x00\x8A\x90\xD6\x08,1400659179080.a74402fcbd9a96a7c92b250721095729.
> 2014-07-21 07:49:45,137 DEBUG
> [member: 'xxx.digitalenvoy.net,60020,1405943192177'
> subprocedure-pool1-thread-2] snapshot.RegionServerSnapshotManager:
> Completed 1/174 local region snapshots.
> 2014-07-21 07:49:45,137 snapshot.FlushSnapshotSubprocedure: Flush
> Snapshotting region
> hosts,\x00\x8A\x90\xD6\x08,1400659179080.a74402fcbd9a96a7c92b250721095729.
> started...
> 2014-07-21 07:49:45,137 regionserver.HRegion: Storing region-info for
> snapshot.
>
> On Jul 21, 2014, at 9:21 AM, Jean-Marc Spaggiari <jean-marc@spaggiari.org>
> wrote:
>
> > Can you also tell us more about your table? How many regions on how many
> > region servers?
> >
> >
> > 2014-07-21 8:23 GMT-04:00 Ted Yu <yuzhihong@gmail.com>:
> >
> >> Normally such a timeout is caused by one region server that is slow in
> >> completing its part of the snapshot procedure.
> >>
> >> Have you looked at the region server logs?
> >> Feel free to pastebin the relevant portion.
> >>
> >> Thanks
> >>
> >> On Jul 21, 2014, at 4:03 AM, Brian Jeltema <
> brian.jeltema@digitalenvoy.net>
> >> wrote:
> >>
> >>> I’m running HBase 0.98. I’m trying to snapshot a table, but it’s timing
> >> out after 60 seconds.
> >>> I increased the value of hbase.snapshot.master.timeoutMillis and
> >> restarted HBase,
> >>> but the timeout still happens after 60 seconds. Any suggestions?
> >>>
> >>> Brian
> >>
>
>
