hbase-user mailing list archives

From Sterfield <sterfi...@gmail.com>
Subject Re: Hbase regionserver.MultiVersionConcurrencyControl Warning
Date Wed, 17 Aug 2016 08:38:58 GMT
>
> On Tue, Aug 16, 2016 at 9:27 AM, Sterfield <sterfield@gmail.com> wrote:
> > >
> >
> > ...
> > > > On the corresponding RS, at the same time, there's a message about a
> > > > big flush, but not with so much memory in the memstore. Also, I don't
> > > > see any warning that could explain why the memstore grew so large
> > > > (nothing about the fact that there are too many hfiles to compact,
> > > > for example)
> > > >
> > > >
> > > >
> > > HBase keeps writing the memstore till it trips a lower limit. It then
> > > moves to try and flush the region, taking writes all the time, until it
> > > hits an upper limit, at which time it stops taking on writes until the
> > > flush completes, to prevent running out of memory. If the RS is under
> > > load, it may take a while to flush out the memstore, causing us to hit
> > > the upper bound. Is that what is going on here?
> >
> >
> >  What I saw from the documentation + research on the Internet is:
> >
> >    - the memstore grows up to the memstore flush size limit (here 256MB)
> >    - it could flush earlier if:
> >       - the total amount of space taken by all the memstores hits the
> >       lower limit (0.35 by default) or the upper limit (0.40 by default).
> >       If it reaches the upper limit, writes are blocked
> >
> Did you hit 'blocked' state in this case? (You can see it in the logs if
> that's the case).


No. I checked the logs and there's nothing on that side.

> >       - under "memory pressure", meaning that there's no more memory
> >       available
> >    - the "multiplier" allows a memstore to grow past the flush size up to
> >    a certain point (here x4), apparently in various cases:
> >       - when too many hfiles have been generated and a compaction must
> >       happen
> >       - apparently under heavy load.
> >
> >
> We will block too if too many storefiles. Again, logs should report if this
> is why it is holding up writes.
>   <property>
>     <name>hbase.hstore.blockingStoreFiles</name>
>     <value>10</value>
>     <description>If more than this number of StoreFiles exist in any one
>       Store (one StoreFile is written per flush of MemStore), updates are
>       blocked for this region until a compaction is completed, or until
>       hbase.hstore.blockingWaitTime has been exceeded.</description>
>   </property>
> (I hate this config -- smile).


Nothing on that front either. At least no warnings / errors in the log file.
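
For reference, here is roughly what I have in hbase-site.xml for the memstore
settings discussed above (a sketch from memory, values as currently deployed):

  <property>
    <name>hbase.hregion.memstore.flush.size</name>
    <value>268435456</value> <!-- 256MB: flush a memstore once it reaches this size -->
  </property>
  <property>
    <name>hbase.hregion.memstore.block.multiplier</name>
    <value>4</value> <!-- block writes once a memstore reaches 4 x flush.size -->
  </property>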

>
> > Back to your question, yes, the server is being load-tested at the
> > moment, so I'm constantly writing 150k rows through OpenTSDB as we speak
> >
> >
> >
> Ok. Can you let the memstores run higher than 256M?


I could, but it starts to get quite big. At 512MB per memstore, with 20
regions of 10GB each, that's 20 x 512MB = 10GB of RAM on my system.
Considering that I've assigned 27GB of heap to each RS, that's almost at the
40% upper limit (27GB x 0.4 ≈ 10.8GB), for only 200GB of raw data in HBase.
OK, all the memstores may not be full at the same time, but in the
worst-case scenario it could happen, leading to forced flushing. And the
situation WILL get worse, as I'll have more regions.
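
If I do try it, I guess the change itself is just bumping the flush size in
hbase-site.xml, something like (536870912 = 512MB):

  <property>
    <name>hbase.hregion.memstore.flush.size</name>
    <value>536870912</value> <!-- 512MB instead of the current 256MB -->
  </property>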

>
> > > > 2016-08-16 12:04:57,752 INFO  [MemStoreFlusher.0] regionserver.HRegion:
> > > > Finished memstore flush of ~821.25 MB/861146920, currentsize=226.67
> > > > MB/237676040 for region
> > > > tsdb,\x00\x03\xD9W\xAD\x82\x00\x00\x00\x01\x00\x00T\x00\
> > > > x00\x0A\x00\x008\x00\x00\x0B\x00\x009\x00\x00\x0C\x00\x005,1471090649103.
> > > > b833cb8fdceff5cd21887aa9ff11e7bc.
> > > > in 13449ms, sequenceid=11332624, compaction requested=true
> > > >
> > > > So, what could explain this amount of memory taken by the memstore,
> > > > and how could I handle such a situation?
> > > >
> > > >
> > > You are taking on a lot of writes? The server is under load? Are lots
> > > of regions concurrently flushing? You could up the flushing thread
> > > count from the default of 1. You could up the multiplier so there is
> > > more runway for the flush to complete within (x6 instead of x4), etc.
> >
> >
> > Thanks for the flushing thread count, I'd never heard about that!
> > I could indeed increase the multiplier. I'll try that as well.
> >
> >
>   <property>
>     <name>hbase.hstore.flusher.count</name>
>     <value>2</value>
>     <description>The number of flush threads. With fewer threads, the
>       MemStore flushes will be queued. With more threads, the flushes will
>       be executed in parallel, increasing the load on HDFS, and potentially
>       causing more compactions.</description>
>   </property>
> See if it helps. If you want to put up more log from a RS for us to look at
> w/ a bit of note on what configs you have, we can take a look.
> St.Ack


I bumped the multiplier to x6 and the flush thread count to 4. Still have
some issues (OpenTSDB complaining about RegionTooBusy).
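
Concretely, the overrides now look like this in my hbase-site.xml (sketch):

  <property>
    <name>hbase.hregion.memstore.block.multiplier</name>
    <value>6</value> <!-- was 4: allow a memstore to grow to 6 x flush.size before blocking -->
  </property>
  <property>
    <name>hbase.hstore.flusher.count</name>
    <value>4</value> <!-- more flush threads so flushes can run in parallel -->
  </property>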

I think that I'll bite the bullet and deactivate the compaction feature on
OpenTSDB. It'll generate more load for reads, but currently the read load
is quite low. It'll also eat more disk, but so be it.
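
If I read the OpenTSDB docs correctly, that should just be a matter of
setting the following in opentsdb.conf (still to be confirmed on my side):

  # disable OpenTSDB's own row compaction (re-writing an hour of data points
  # into a single cell)
  tsd.storage.enable_compaction = false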

Currently, each hour, for 20 minutes, I have:

   - a reduction in writes (not by much, but still)
   - many more reads, leading to additional read latency
   - lots of network in/out on all the RS, to gather all the rows and
   re-write them
   - the compaction queue is sometimes impacted
   - lots of CPU load (spikes to 10-12 on an 8-CPU machine)
   - I suppose all this re-writing is also causing the poor data locality
   I'm currently seeing (50% of my regions have a locality < 0.5)

The current advantages of the compaction feature are:

   - re-writing all the data into a single row makes it much simpler to read
   - having one row per metric per hour also saves some disk space.

OK, the disk space can be heavily impacted by this, but that is manageable
by adding more nodes to the system (after all, that's what HDFS / HBase are
designed for).

So IMHO, it's not worth keeping this compaction feature, especially for a
very-high-throughput cluster.

Some information here [1]

Thanks

[1] :
https://groups.google.com/forum/#!searchin/opentsdb/compaction|sort:relevance/opentsdb/OCpmgLSPnRs/OO0FQnFXgtsJ
