hbase-user mailing list archives

From 茅旭峰 <m9s...@gmail.com>
Subject Re: How to control the size of WAL logs
Date Tue, 22 Mar 2011 04:54:03 GMT
Thanks J-D, currently our MAX_FILESIZE is 1GB.

What about the thousands of threads in the master during startup?

===
On the other hand, if there are too many files under /hbase/.logs when I
try to restart the master, there are thousands of threads of the classes
DataStreamer and ResponseProcessor trying to handle the hlogs.
Then the master quickly runs into an OOME. Is there any way to control
this situation?
===

Can I control the number of DataStreamer and ResponseProcessor threads?
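
For the master-side thread count, the relevant knobs appear further down in
this thread: hbase.regionserver.hlog.splitlog.writer.threads and
hbase.regionserver.hlog.splitlog.buffersize. A minimal hbase-site.xml sketch,
with purely illustrative values (the defaults, per the reply below, are 3
threads and a 1024*1024*128 buffer):

  <property>
    <name>hbase.regionserver.hlog.splitlog.writer.threads</name>
    <!-- number of parallel split-log writer threads in the master; default 3 -->
    <value>2</value>
  </property>
  <property>
    <name>hbase.regionserver.hlog.splitlog.buffersize</name>
    <!-- bytes buffered while splitting; 64 MB here instead of the 128 MB default -->
    <value>67108864</value>
  </property>
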


On Tue, Mar 22, 2011 at 12:23 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:

> There's not really anything in hbase preventing you from having that
> many regions, but usually for various reasons we try to keep it under
> a few hundred. Especially in the bulk uploading case, it has a huge
> impact because of all the memstores a RS has to manage.
>
> You can set the size for splitting by setting MAX_FILESIZE on your
> table to at least 1GB (if you can give your region server a big heap
> like 8-10GB, then you can set those regions even bigger).
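
MAX_FILESIZE is a per-table attribute (it can be changed with the shell's
alter command); the cluster-wide default used when a table does not set it is
hbase.hregion.max.filesize in hbase-site.xml. A minimal sketch, assuming the
1 GB figure suggested above:

  <property>
    <name>hbase.hregion.max.filesize</name>
    <!-- region split threshold in bytes; 1 GB, raise further with 8-10 GB region server heaps -->
    <value>1073741824</value>
  </property>
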
>
> J-D
>
> On Mon, Mar 21, 2011 at 7:59 PM, 茅旭峰 <m9suns@gmail.com> wrote:
> > Thanks, J-D.
> >
> > No, we are not using any compressor.
> >
> > We have a limited number of nodes for region servers, so each of them
> > holds thousands of regions. Any guideline on this point?
> >
> > On Tue, Mar 22, 2011 at 10:30 AM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
> >
> >> HBase doesn't put a hard block on the number of hlogs like it does for
> >> memstore size or store files to compact, so it seems you are able to
> >> insert more data than you are flushing.
> >>
> >> Are you using GZ compression? This could be a cause for slow flushes.
> >>
> >> How many regions do you have per region server? Your log seems to
> >> indicate that you have a ton of them.
> >>
> >> J-D
> >>
> >> On Mon, Mar 21, 2011 at 7:23 PM, 茅旭峰 <m9suns@gmail.com> wrote:
> >> > Regarding hbase.regionserver.maxlogs,
> >> >
> >> > I've set it to 2, but it turns out the number of files under
> >> > /hbase/.logs still keeps increasing.
> >> > I see lots of logs like
> >> > ====
> >> > 2011-03-22 00:00:07,156 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for table1,sZD5CTBLUdV55xWWkmkI5rb1mJM=,1300587568567.8a84acf58dd3d684ccaa47d4fb4fd53a. because regionserver60020.cacheFlusher; priority=-8, compaction queue size=1755
> >> > 2011-03-22 00:00:07,183 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up with memory above low water.
> >> > 2011-03-22 00:00:07,186 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Under global heap pressure: Region table1,KGBJhl9RON29fT0hhak5-tBc-zs=,1300521656641.56e3a141164b546ae84d57e46a513922. has too many store files, but is 6.2m vs best flushable region's 2.1m. Choosing the bigger.
> >> > 2011-03-22 00:00:07,186 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region table1,KGBJhl9RON29fT0hhak5-tBc-zs=,1300521656641.56e3a141164b546ae84d57e46a513922. due to global heap pressure
> >> > 2011-03-22 00:00:07,186 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for table1,KGBJhl9RON29fT0hhak5-tBc-zs=,1300521656641.56e3a141164b546ae84d57e46a513922., current region memstore size 6.2m
> >> > 2011-03-22 00:00:07,201 INFO org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Using syncFs -- HDFS-200
> >> > 2011-03-22 00:00:07,241 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/cloud138,60020,1300712706331/cloud138%3A60020.1300723196796, entries=119, filesize=67903254. New hlog /hbase/.logs/cloud138,60020,1300712706331/cloud138%3A60020.1300723207156
> >> > 2011-03-22 00:00:07,241 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=398, maxlogs=2; forcing flush of 1 regions(s): 334c81997502eb3c66c2bb9b47a87bcc
> >> > 2011-03-22 00:00:07,242 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished snapshotting, commencing flushing stores
> >> > 2011-03-22 00:00:07,577 INFO org.apache.hadoop.hbase.regionserver.Store: Renaming flushed file at hdfs://cloud137:9000/hbase/table1/56e3a141164b546ae84d57e46a513922/.tmp/907665384208923152 to hdfs://cloud137:9000/hbase/table1/56e3a141164b546ae84d57e46a513922/cfEStore/2298819588481793315
> >> > 2011-03-22 00:00:07,589 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://cloud137:9000/hbase/table1/56e3a141164b546ae84d57e46a513922/cfEStore/2298819588481793315, entries=6, sequenceid=2229486, memsize=6.2m, filesize=6.2m
> >> > 2011-03-22 00:00:07,591 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.2m for region table1,KGBJhl9RON29fT0hhak5-tBc-zs=,1300521656641.56e3a141164b546ae84d57e46a513922. in 405ms, sequenceid=2229486, compaction requested=true
> >> > ====
> >> >
> >> > Does this mean we have too many requests for the regionserver to keep
> >> > up with the growth of the hlogs?
> >> >
> >> > On the other hand, if there are too many files under /hbase/.logs when
> >> > I try to restart the master, there are thousands of threads of the
> >> > classes DataStreamer and ResponseProcessor trying to handle the hlogs.
> >> > Then the master quickly runs into an OOME. Is there any way to control
> >> > this situation?
> >> >
> >> > On Fri, Mar 18, 2011 at 12:20 AM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
> >> >
> >> >> You can limit the number of WALs and their size on the region server
> >> >> by tuning:
> >> >>
> >> >> hbase.regionserver.maxlogs the default is 32
> >> >> hbase.regionserver.hlog.blocksize the default is whatever your HDFS
> >> >> blocksize times 0.95
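
As a sketch, these two region-server properties would go in hbase-site.xml
like this; the values are illustrative only (the defaults, as noted above,
are 32 logs and 0.95 times the HDFS block size):

  <property>
    <name>hbase.regionserver.maxlogs</name>
    <!-- flushes are forced once this many hlogs exist on a region server -->
    <value>16</value>
  </property>
  <property>
    <name>hbase.regionserver.hlog.blocksize</name>
    <!-- hlog block size in bytes; 64 MB here -->
    <value>67108864</value>
  </property>
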
> >> >>
> >> >> You can limit the number of parallel threads in the master by tuning:
> >> >>
> >> >> hbase.regionserver.hlog.splitlog.writer.threads the default is 3
> >> >> hbase.regionserver.hlog.splitlog.buffersize the default is 1024*1024*128
> >> >>
> >> >> J-D
> >> >>
> >> >> On Wed, Mar 16, 2011 at 11:57 PM, 茅旭峰 <m9suns@gmail.com> wrote:
> >> >> > Hi,
> >> >> >
> >> >> > In our tests, we've accumulated lots of WAL logs under .logs, which
> >> >> > leads to quite a long pause or even an OOME when restarting either
> >> >> > the master or a region server. We're doing a sort of bulk import but
> >> >> > are not using bulk-import tricks like turning off the WAL. We don't
> >> >> > really know how our application will use hbase, so it is possible
> >> >> > that users will do batch imports unless we're running out of space.
> >> >> > I wonder if there is any property to set to control the size of the
> >> >> > WAL; would setting a smaller 'hbase.regionserver.logroll.period'
> >> >> > help?
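
For reference, hbase.regionserver.logroll.period is the interval in
milliseconds after which the region server rolls its current WAL regardless
of size. A sketch with an assumed 10-minute value (the usual default is one
hour):

  <property>
    <name>hbase.regionserver.logroll.period</name>
    <!-- milliseconds; assumed example: roll at least every 10 minutes -->
    <value>600000</value>
  </property>

Rolling more often yields more, smaller hlogs; it is
hbase.regionserver.maxlogs that forces memstore flushes once the count grows,
as the "Too many hlogs ... forcing flush" line in the log excerpt earlier on
this page shows.
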
> >> >> >
> >> >> > On the other hand, since we have lots of regions, the master easily
> >> >> > runs into an OOME, due to the memory occupied by the instance of
> >> >> > Assignment.regions. When we were trying to restart the master, it
> >> >> > always died with an OOME. From the hprof file, I think it is because
> >> >> > the HLogSplitter$OutputSink instance holds too many
> >> >> > HLogSplitter$WriterAndPaths in logWriters, which in turn hold the
> >> >> > buffers of wal.SequenceFileLogWriter.
> >> >> > Is there any trick to avoid this kind of scenario?
> >> >> >
> >> >> > Thanks and regards,
> >> >> >
> >> >> > Mao Xu-Feng
> >> >> >
> >> >>
> >> >
> >>
> >
>
