hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Hbase regionserver.MultiVersionConcurrencyControl Warning
Date Fri, 12 Aug 2016 16:29:46 GMT
On Fri, Aug 12, 2016 at 1:22 AM, Sterfield <sterfield@gmail.com> wrote:

> ...
> I've reached out to the OpenTSDB guys to find out exactly what is done
> during a compaction. With that information, it may be possible to tune
> HBase to handle the load correctly. To me, it seems that there's a big
> scan, then lots of writes.
>
> I'll continue to work on this, but I welcome any comments or advice on
> the subject!
>
>
This is a well-known issue over in tsdb-land. IIRC, they are working on an
alternative to the once-an-hour compaction. See what they say over there,
Guillaume.
Thanks,
St.Ack



> Thanks,
>
> Guillaume
>
> 2016-08-11 17:13 GMT+02:00 John Leach <jleach@splicemachine.com>:
>
> > We saw this as well at Splice Machine.  This led us to run compactions in
> > Spark.  Once we did this, we saw the compaction effects go away almost
> > entirely.
> >
> > Here is a link to our code.
> >
> > https://github.com/splicemachine/spliceengine/blob/73640a81972ef5831c1ea834ac9ac22f5b3428db/hbase_sql/src/main/java/com/splicemachine/olap/CompactionJob.java
> >
> > We have a todo to contribute this back to the community.
> >
> > Regards,
> > John Leach
> >
> > > On Aug 11, 2016, at 8:03 AM, Sterfield <sterfield@gmail.com> wrote:
> > >
> > > And it's gone [1]. No more spikes in the writes/reads, no more OpenTSDB
> > > errors. So I think it's safe to assume that OpenTSDB compaction is
> > > generating some additional load that is not handled very well by HBase,
> > > and is therefore causing the issues I'm mentioning.
> > >
> > > It also seems that the MVCC errors are gone (to be checked).
> > >
> > > I don't know how to tune HBase so that it can handle compaction without
> > > any issues, but at least I know where the problem comes from.
> > >
> > > [1] : https://www.dropbox.com/s/d6l2lngr6mpizh9/Without%20OpenTSDB%20compaction.png?dl=0
> > >
> > > 2016-08-11 13:18 GMT+02:00 Sterfield <sterfield@gmail.com>:
> > >
> > >> Hello,
> > >>
> > >>
> > >>>> Hi,
> > >>>>
> > >>>> Thanks for your answer.
> > >>>>
> > >>>> I'm currently testing OpenTSDB + HBase, so I'm generating thousands
> > >>>> of HTTP POST on OpenTSDB in order to write data points (currently up
> > >>>> to 300k/s). OpenTSDB is only doing increment / append (AFAIK).
> > >>>>
> > >>> How many nodes or is that 300k/s on a single machine?
> > >>
> > >>
> > >> 1 master node, 4 slaves, colocated HDFS + RS.
> > >>
> > >> Master : m4.2xlarge (8CPU, 32GB RAM)
> > >> Slave : d2.2xlarge (8CPU, 61GB RAM, 6x2T disk)
> > >>
> > >>
> > >>>> If I have understood your answer correctly, some write ops are
> > >>>> queued, and some younger ops in the queue are "done" while some
> > >>>> older ones are not.
> > >>>>
> > >>>>
> > >>> What Anoop said, plus: we'll see the STUCK notice when it is taking a
> > >>> long time for the MVCC read point to come up to the write point of
> > >>> the currently ongoing transaction. We will hold the updating thread
> > >>> until the read point is equal to or greater than the current
> > >>> transaction's write point. We do this to ensure a client can read its
> > >>> own writes. The MVCC is region-wide. If there are many ongoing
> > >>> updates, a slightly slower one may hold up the completion of other
> > >>> outstanding transactions. Does the STUCK message go away after some
> > >>> time? Does it happen frequently? A thread dump while this is going on
> > >>> would be interesting if possible, or a look at what else is going on
> > >>> on the server around this time (see in logs?).
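
To make the mechanics concrete, here is a much-simplified Java sketch of the
region-wide wait described above. It is illustrative only, not HBase's actual
MultiVersionConcurrencyControl: the class and method names are invented, and
the real implementation adds timeouts, interrupt handling, and the STUCK
warning once a wait runs long.

import java.util.ArrayDeque;

// A much-simplified, illustrative MVCC, one instance per region, in the
// spirit of HBase's MultiVersionConcurrencyControl. SimpleMvcc and all of
// its names are invented for this sketch; this is not the real class.
public class SimpleMvcc {

    // One entry per in-flight write, queued in begin() order.
    public static final class WriteEntry {
        final long number;
        boolean completed;
        WriteEntry(long number) { this.number = number; }
    }

    private final ArrayDeque<WriteEntry> queue = new ArrayDeque<>();
    private long nextWriteNumber = 1; // the "write point"
    private long readPoint = 0;       // highest number every reader may see

    // A write transaction begins: stamp it with the next number, queue it.
    public synchronized WriteEntry begin() {
        WriteEntry e = new WriteEntry(nextWriteNumber++);
        queue.addLast(e);
        return e;
    }

    // The write is durable (memstore + WAL done): mark its entry complete,
    // then advance the read point over the longest prefix of completed
    // entries. An incomplete older entry stops the advance.
    public synchronized void complete(WriteEntry entry) {
        entry.completed = true;
        while (!queue.isEmpty() && queue.peekFirst().completed) {
            readPoint = queue.pollFirst().number;
        }
        notifyAll();
    }

    // Hold the handler thread until the read point reaches this write's
    // number, so the client that issued the write can read it back at once.
    // This loop is where a handler sits when HBase logs the STUCK warning.
    public synchronized void waitForRead(WriteEntry entry) throws InterruptedException {
        while (readPoint < entry.number) {
            wait();
        }
    }
}

The key property the sketch preserves: the read point only advances over a
contiguous prefix of completed writes, so one slow old write holds back every
younger write that is already durable.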
> > >>
> > >>
> > >> Yes, the STUCK message persists for quite some time (a dozen minutes
> > >> or so), and it happens every hour.
> > >>
> > >>>> Few additional questions:
> > >>>>
> > >>>>   - Is it a problem regarding the data, or is it "safe"? In other
> > >>>>   words, will the old data that has not yet been written be dropped,
> > >>>>   or will it be written correctly, just later?
> > >>>>
> > >>> No data is going to be dropped. The STUCK message is just flagging
> > >>> that a write is taking a while to complete while we wait on MVCC. Are
> > >>> you backed up on disk or another resource, or have a bunch of writers
> > >>> all happened to arrive at one particular region (MVCC is by region)?
> > >>
> > >>
> > >> I've pre-split my "tsdb" table so that it is handled by all 4 servers,
> > >> so I think I'm OK on this side. All data is stored locally on the EC2
> > >> disks.
> > >>
> > >>
> > >>>>   - How can I debug this and, if possible, fix it?
> > >>>>
> > >>>>
> > >>> See above. Your writes are well distributed. Disks are healthy?
> > >>
> > >> So here's the result of my investigation + my assumptions:
> > >>
> > >>   - Every hour, my RSs have a peak of load / CPU. I was watching a RS
> > >>   when it happened (that's easy, it's at the beginning of each hour),
> > >>   and the RS java process was taking all the CPU available on the
> > >>   machine, hence the load. You can see the load of all my servers in
> > >>   those images, see [1] and [2].
> > >>   - Disks are fine IMO. Write IO is OK on average, peaking at 300-400
> > >>   IOPS, within range for a decent mechanical drive. I don't see a
> > >>   particular IOPS load at that time.
> > >>   - However, you can see that every hour (see [3]):
> > >>      - Calls are queued
> > >>      - Writes are impacted (from 160k/s down to 150-140k/s)
> > >>      - Read RPCs increase (I suppose the RS is not answering, so
> > >>      they are queued and suddenly released in a batch)
> > >>      - GC count spikes a bit
> > >>   - You can see that the flush and compaction queues are fine
> > >>   - Also, heap usage is rather fine (I've set -Xmx27136m). At first, I
> > >>   thought I was reaching the maximum heap size, but you can see on the
> > >>   graph that 10.200.0.4 is far from the maximum.
> > >>   - OpenTSDB is also showing that it cannot write or compact rows.
> > >>   Here's an error:
> > >>
> > >> 10:19:02.415 ERROR [CompactionQueue.call] - Failed to write a row to re-compact
> > >> org.hbase.async.RemoteException: org.apache.hadoop.hbase.RegionTooBusyException:
> > >> Above memstore limit, regionName=tsdb,\x00\x07\xE4W\xA9\xE1\xE0\x00\x00\x01\x00\x00\x86\x00\x00\x0B\x00\x004\x00\x00\x0C\x00\x00I,1470758107028.5482c665b5843179116be7643cf1fa67.,
> > >> server=ip-10-200-0-6.eu-west-1.campaign.aws,16020,1470888079169,
> > >> memstoreSize=537335912, blockingMemStoreSize=536870912
> > >>
> > >> Then a Java stack trace.
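
A note on those numbers, assuming the HBase 1.x defaults (worth double-checking
against your hbase-site.xml): blockingMemStoreSize=536870912 is exactly
hbase.hregion.memstore.flush.size (128 MB by default) times
hbase.hregion.memstore.block.multiplier (4 by default), so the region started
rejecting updates because its memstore passed 512 MB before flushes could catch
up. A hedged sketch of that arithmetic and one possible knob:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Sketch of where blockingMemStoreSize=536870912 comes from, assuming the
// HBase 1.x defaults (128 MB flush size, block multiplier 4). Class name
// and the multiplier override below are illustrative, not a recommendation.
public class MemstoreLimitSketch {
    public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();

        long flushSize = conf.getLong(
                "hbase.hregion.memstore.flush.size", 128L * 1024 * 1024);
        long multiplier = conf.getLong(
                "hbase.hregion.memstore.block.multiplier", 4L);

        // 134217728 * 4 = 536870912: updates block (RegionTooBusyException)
        // once a single region's memstore grows past this many bytes.
        System.out.println("blocking limit = " + (flushSize * multiplier));

        // One possible mitigation for bursty writers: more headroom before
        // the region blocks, at the cost of bigger flushes and more heap.
        conf.setLong("hbase.hregion.memstore.block.multiplier", 8L);
    }
}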
> > >>
> > >> I'm wondering if the issue isn't caused by OpenTSDB "compaction" [4].
> > >> I don't fully understand what's being done when OpenTSDB performs
> > >> those compactions, but it's done "when the base hour has passed", so
> > >> it could explain those hourly spikes. I'll try to deactivate this
> > >> feature (see the note after the links below) and see if those spikes
> > >> are still there.
> > >>
> > >> [1] : https://www.dropbox.com/s/x8t56le3cxf3qci/Host%20detail%20%231.png?dl=0
> > >> [2] : https://www.dropbox.com/s/kmtg0dnedyphptr/Host%20detail%20%232.png?dl=0
> > >> [3] : https://www.dropbox.com/s/yb6uq08xb5u7v0a/GC.png?dl=0
> > >> [4] : http://opentsdb.net/docs/build/html/user_guide/backends/hbase.html#compactions
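
On deactivating the feature: OpenTSDB's hourly row compaction can be switched
off in opentsdb.conf. The key name below is from memory and should be verified
against the OpenTSDB documentation for your version:

# opentsdb.conf -- verify this key against the docs for your OpenTSDB version
tsd.storage.enable_compaction = false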
> > >>
> > >> 2016-08-10 19:08 GMT+02:00 Stack <stack@duboce.net>:
> > >>
> > >>> On Wed, Aug 10, 2016 at 1:13 AM, Sterfield <sterfield@gmail.com> wrote:
> > >>>
> > >>>> Hi,
> > >>>>
> > >>>> Thanks for your answer.
> > >>>>
> > >>>> I'm currently testing OpenTSDB + HBase, so I'm generating thousands
> > >>>> of HTTP POST on OpenTSDB in order to write data points (currently up
> > >>>> to 300k/s). OpenTSDB is only doing increment / append (AFAIK).
> > >>>>
> > >>> How many nodes or is that 300k/s on a single machine?
> > >>>
> > >>>>   - If it's not a problem right now, and if the situation persists,
> > >>>>   can it be a problem later? (For example, HBase cannot keep up the
> > >>>>   pace and finally drops some data.)
> > >>>>
> > >>>
> > >>> HBase will not drop data. It will not be a problem later. It is a
> > >>> WARN-level message saying writes are stalling some.
> > >>>
> > >>>> I have to mention that OpenTSDB is using their own HBase client
> > >>>> called asynchbase (https://github.com/OpenTSDB/asynchbase). I don't
> > >>>> know if it's related or not, but just FYI.
> > >>>>
> > >>>>
> > >>> This is server-side, internal to the Region. Nothing to do with
> > >>> asynchbase (other than asynchbase tends to run with less friction, so
> > >>> it may be writing faster than our default client).
> > >>>
> > >>> St.Ack
> > >>>
> > >>>
> > >>>
> > >>>> Thanks,
> > >>>>
> > >>>> Guillaume
> > >>>>
> > >>>> 2016-08-10 8:10 GMT+02:00 Anoop John <anoop.hbase@gmail.com>:
> > >>>>
> > >>>>> Ya, it comes with a write workload, not with concurrent reads.
> > >>>>> Once the write is done (memstore write and WAL write), we mark the
> > >>>>> MVCC operation corresponding to it as complete and wait for a
> > >>>>> global read point to advance to at least this point. (Every write
> > >>>>> op will have a number corresponding to its mvcc op.)
> > >>>>> As the mvcc op begins, a write entry (which carries this number)
> > >>>>> is added to a Q. Once the write is over and we complete the mvcc
> > >>>>> op, we mark the entry corresponding to it (it is in the Q now) as
> > >>>>> completed/done. There are other entries in the Q corresponding to
> > >>>>> other parallel writes. Now we will try to move the global read
> > >>>>> point at least up to this mark, which means we have to make sure
> > >>>>> all the write ops with a lesser number than this one get completed.
> > >>>>> Until then, there is a wait. So what is happening here is that a
> > >>>>> later write op has completed and is still waiting for an older
> > >>>>> write to get completed. What sort of write ops do you have? Batch
> > >>>>> put and/or increment/append?
> > >>>>>
> > >>>>> -Anoop-
> > >>>>>
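
A toy driver for the hypothetical SimpleMvcc sketch shown earlier in the
thread, acting out exactly the stall Anoop describes: the younger write is
durable first, yet its wait cannot return until the older write completes.

// Toy driver for the SimpleMvcc sketch above (both hypothetical): the
// younger write (number 2) becomes durable first, but its waitForRead()
// cannot return until the older write (number 1) completes, because the
// read point only advances over a contiguous prefix of completed entries.
public class MvccStallDemo {
    public static void main(String[] args) throws InterruptedException {
        SimpleMvcc mvcc = new SimpleMvcc();

        SimpleMvcc.WriteEntry older = mvcc.begin();   // number 1, slow
        SimpleMvcc.WriteEntry younger = mvcc.begin(); // number 2, fast

        mvcc.complete(younger); // durable, but the read point is still 0

        Thread waiter = new Thread(() -> {
            try {
                mvcc.waitForRead(younger); // blocks: the "STUCK" wait
                System.out.println("younger write now visible");
            } catch (InterruptedException ignored) { }
        });
        waiter.start();

        Thread.sleep(1000);   // pretend the older write is slow
        mvcc.complete(older); // read point jumps 0 -> 2; waiter released
        waiter.join();
    }
}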
> > >>>>>
> > >>>>> On Tue, Aug 9, 2016 at 6:45 PM, Sterfield <sterfield@gmail.com> wrote:
> > >>>>>> 2016-08-05 17:52 GMT+02:00 Bryan Beaudreault <bbeaudreault@hubspot.com>:
> > >>>>>>
> > >>>>>>> I'm also interested in an answer here. We see this from time to
> > >>>>>>> time in our production HBase clusters (non-opentsdb). It seems to
> > >>>>>>> be related to contention under heavy reads or heavy writes. But
> > >>>>>>> it's not clear what the impact is here.
> > >>>>>>>
> > >>>>>>> On Fri, Aug 5, 2016 at 5:14 AM Sterfield <sterfield@gmail.com> wrote:
> > >>>>>>>
> > >>>>>>>> Hi,
> > >>>>>>>>
> > >>>>>>>> I'm currently testing HBase 1.2.1 + OpenTSDB. For that, I'm
> > >>>>>>>> generating a high load of HTTP PUT on OpenTSDB, which then
> > >>>>>>>> writes into HBase. Currently, I'm able to feed 300k data points
> > >>>>>>>> per second, and I'm trying to achieve higher speeds.
> > >>>>>>>>
> > >>>>>>>> I have also activated JMX on both Master and Region servers in
> > >>>>>>>> order to grab a bunch of metrics from HBase.
> > >>>>>>>>
> > >>>>>>>> At some point, I'm noticing a bunch of warnings in a region
> > >>>>>>>> server log file:
> > >>>>>>>>
> > >>>>>>>> 2016-08-04 05:12:06,537 WARN [B.defaultRpcServer.handler=21,queue=3,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714778}
> > >>>>>>>> 2016-08-04 05:12:06,538 WARN [B.defaultRpcServer.handler=19,queue=1,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714778}
> > >>>>>>>> 2016-08-04 05:12:06,538 WARN [B.defaultRpcServer.handler=40,queue=4,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714778}
> > >>>>>>>> 2016-08-04 05:12:06,548 WARN [B.defaultRpcServer.handler=0,queue=0,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714778}
> > >>>>>>>> 2016-08-04 05:12:06,549 WARN [B.defaultRpcServer.handler=35,queue=5,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714778}
> > >>>>>>>> 2016-08-04 05:12:06,556 WARN [B.defaultRpcServer.handler=7,queue=1,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
> > >>>>>>>> 2016-08-04 05:12:06,558 WARN [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
> > >>>>>>>> 2016-08-04 05:12:06,559 WARN [B.defaultRpcServer.handler=14,queue=2,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
> > >>>>>>>> 2016-08-04 05:12:06,609 WARN [B.defaultRpcServer.handler=42,queue=0,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
> > >>>>>>>> 2016-08-04 05:12:06,609 WARN [B.defaultRpcServer.handler=32,queue=2,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
> > >>>>>>>> 2016-08-04 05:12:06,619 WARN [B.defaultRpcServer.handler=2,queue=2,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
> > >>>>>>>> 2016-08-04 05:12:06,619 WARN [B.defaultRpcServer.handler=48,queue=0,port=16020] regionserver.MultiVersionConcurrencyControl: STUCK: MultiVersionConcurrencyControl{readPoint=7714719, writePoint=7714779}
> > >>>>>>>>
> > >>>>>>>> Can someone shed some light on this? I've been googling and
> > >>>>>>>> searching the documentation, but found no clue on this subject.
> > >>>>>>>>
> > >>>>>>>> Is it a problem? If so, how can I fix it?
> > >>>>>>>>
> > >>>>>>>> Thanks a lot for your help,
> > >>>>>>>>
> > >>>>>>>> Guillaume
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>> Hello,
> > >>>>>>
> > >>>>>> Sorry to bump this thread, but does no one know anything about
> > >>>>>> this issue?
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>>
> > >>>>>> Guillaume
> > >>>>>
> > >>>>
> > >>>
> > >>
> > >>
> >
> >
>
