hbase-user mailing list archives

From Otis Gospodnetic <otis.gospodne...@gmail.com>
Subject Re: Does 'online region merge' make regions unavailable for some time?
Date Thu, 29 Jan 2015 04:08:49 GMT
Thanks everyone for the feedback.  What we ended up implementing is
something that does these merges programmatically, but limits merges to
regions that live on the same RS.  They take on the order of just a few
seconds for us.
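
A minimal sketch of that kind of selection step follows. It is illustrative only and is not the code referred to above. It walks regions in key order and pairs up neighbours hosted on the same region server. `RegionStub`, `SameServerMergePlanner`, and the server names are hypothetical stand-ins rather than HBase classes, and pairing only adjacent regions is an assumption of the sketch. Each pair would then be passed to whatever merge call the cluster exposes, such as the shell's merge_region command quoted below.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for region metadata; NOT an HBase class. */
final class RegionStub {
    final String encodedName; // e.g. the hash shown by the HBase shell
    final String server;      // region server currently hosting the region

    RegionStub(String encodedName, String server) {
        this.encodedName = encodedName;
        this.server = server;
    }
}

/** Picks merge candidates without ever requiring a region move. */
final class SameServerMergePlanner {
    /**
     * Walks the regions in key order and pairs each region with its right-hand
     * neighbour when both live on the same server. A region is used in at most
     * one pair per pass.
     */
    static List<String[]> plan(List<RegionStub> regionsInKeyOrder) {
        List<String[]> pairs = new ArrayList<>();
        int i = 0;
        while (i + 1 < regionsInKeyOrder.size()) {
            RegionStub a = regionsInKeyOrder.get(i);
            RegionStub b = regionsInKeyOrder.get(i + 1);
            if (a.server.equals(b.server)) {
                pairs.add(new String[] {a.encodedName, b.encodedName});
                i += 2; // both regions are consumed by this merge
            } else {
                i += 1; // different servers: skip, merging would need a move
            }
        }
        return pairs;
    }
}
```

Skipping pairs that span two servers trades fewer merges per pass for avoiding the re-assignment step, which is the slower case described in the quoted reply.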

Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/


On Thu, Jan 22, 2015 at 8:04 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> I did an experiment on a lightly populated table by merging second and
> third regions which were on the same region server.
>
> hbase(main):001:0> merge_region '41e99dcbb6e92a54925a4abc825c7dce', 'c53fbfe2dd70e7fb83d099aee3bf7758'
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> 0 row(s) in 2.4670 seconds
>
> When the two regions are on different servers, one of them needs to be
> moved (re-assigned) so that they're on the same region server. The duration
> for that case would be longer.
>
> Online merge first writes out a reference for each store file of the two
> merging regions under the given directory. Then it moves the merge's
> temporary directory to the proper location in the filesystem. That's why
> the whole process is fast.
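
To illustrate why reference files plus a directory move are cheap, here is a minimal sketch on a local filesystem using `java.nio.file`. It is not HBase's implementation: the class name `ReferenceMergeSketch`, the `.ref` naming scheme, and the directory layout are all hypothetical, and real HBase performs these steps on HDFS. The point it shows is that the work is proportional to the number of store files, not to their size.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

/** Illustrative only: models "write references, then rename the directory". */
final class ReferenceMergeSketch {
    /**
     * Writes one small reference file per store file into tmpDir, then renames
     * tmpDir to mergedDir. No store file data is read or copied.
     * Assumes every store file has a parent directory (its region directory)
     * and that tmpDir and mergedDir are on the same filesystem.
     */
    static Path merge(List<Path> storeFiles, Path tmpDir, Path mergedDir)
            throws IOException {
        Files.createDirectories(tmpDir);
        for (Path storeFile : storeFiles) {
            // Hypothetical naming: <parent region dir>.<store file>.ref
            String refName = storeFile.getParent().getFileName()
                    + "." + storeFile.getFileName() + ".ref";
            // The reference only records where the real data lives.
            Files.write(tmpDir.resolve(refName),
                    storeFile.toString().getBytes(StandardCharsets.UTF_8));
        }
        // A rename is a metadata operation, independent of data size.
        return Files.move(tmpDir, mergedDir, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

Under these assumptions, merging two multi-GB regions costs a handful of tiny writes and one rename, which is consistent with the couple of seconds measured above.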
>
> Cheers
>
> On Wed, Jan 21, 2015 at 7:58 PM, Otis Gospodnetic <
> otis.gospodnetic@gmail.com> wrote:
>
> > Hi,
> >
> > On Wed, Jan 21, 2015 at 10:26 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > > bq. doesn't that mean that a completely new region needs to be written
> > >
> > > Yes, a new region (C in the pdf) would be created.
> > >
> > > bq. If regions are a few GB in size
> > >
> > > The data files from both regions are moved to the merged region's
> > > directory.
> > >
> >
> > Does that "move" mean local disk writes and no sending of data over the
> > network?
> > Even if there is no network involved and it's just writes, wouldn't moving
> > a region that's a few GB in size take more than just a couple of seconds?
> >
> > Plus, there is that other thing I mentioned - merge(regionA,regionB) ==>
> > write a whole new multi-GB region to (local) disk is also something that
> > would take on the order of minutes, no?
> >
> >
> > > bq. (in flight) writes or reads going to the regions that are being merged?
> > >
> > > The above operations have to wait for merged region to be assigned.
> > >
> >
> > When you say "have to wait", do you mean:
> > A) the application using HBase should not read and write data from regions
> > being merged.  It should know it needs to wait with reads and writes until
> > after regions are merged.
> > OR
> > B) HBase will block/buffer/delay reads and writes until the regions are
> > merged and will then let them through
> > ?
> >
> > Thanks,
> > Otis
> >
> >
> > > On Wed, Jan 21, 2015 at 7:09 PM, Otis Gospodnetic <
> > > otis.gospodnetic@gmail.com> wrote:
> > >
> > > > Thanks Enis & Ted!
> > > > A few more questions inline.
> > > >
> > > > On Wed, Jan 21, 2015 at 9:53 PM, Enis Söztutar <enis.soz@gmail.com> wrote:
> > > >
> > > > > Online in this context is HBase cluster being online, not individual
> > > > > regions. For the merge process, the regions go briefly offline similar
> > > > > to how splits work. It should be on the order of seconds.
> > > > >
> > > >
> > > > Hm, but how could it be so quick?  Aren't regions first offlined and
> > > > then one of them is *moved*?  Or maybe data is not actually sent over
> > > > the network?
> > > >
> > > > But if 2 regions are being merged, doesn't that mean that a completely
> > > > new region needs to be written (over the network, to disk, and then HDFS
> > > > replication also needs to take place).  If regions are a few GB in size,
> > > > can that really be done in a matter of seconds?
> > > >
> > > > What happens to the (in flight) writes or reads going to the regions
> > > > that are being merged?
> > > >
> > > > Thanks,
> > > > Otis
> > > >
> > > >
> > > >
> > > > > On Wed, Jan 21, 2015 at 10:26 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > > >
> > > > > > Please take a look at slides 5 and 6 in this file:
> > > > > >
> > > > > > https://issues.apache.org/jira/secure/attachment/12561887/merge%20region.pdf
> > > > > >
> > > > > > It is clear that the two regions to be merged are taken offline in
> > > > > > step 1.
> > > > > >
> > > > > > Cheers
> > > > > >
> > > > > > On Tue, Jan 20, 2015 at 5:26 PM, Otis Gospodnetic <
> > > > > > otis.gospodnetic@gmail.com> wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > Considering this is called the *online* region merge, I would
> > > > > > > assume regions being merged never go offline during the merge and
> > > > > > > both regions being merged are available for reading and writing at
> > > > > > > all times, even during the merge.... though I don't get how writes
> > > > > > > would work if one region is being moved from one RS to another....
> > > > > > > so maybe this is not truly online and writes are either rejected or
> > > > > > > buffered/blocked until the region is moved AND merged?  Anyone knows
> > > > > > > for sure?
> > > > > > >
> > > > > > > I see this in one of the comments:
> > > > > > > Q: If one (or both) of the regions were receiving non-trivial load
> > > > > > > prior to this action, would client(s) be affected ?
> > > > > > > A: Yes, region would be off services in a short time, it is equal
> > > > > > > with moving region, e.g balance a region
> > > > > > >
> > > > > > > Also took a look at the patch:
> > > > > > >
> > > > > > > https://issues.apache.org/jira/secure/attachment/12574965/hbase-7403-trunkv33.patch
> > > > > > >
> > > > > > > And see:
> > > > > > >
> > > > > > > +    /**
> > > > > > > +     * The merging region A has been taken out of the server's online regions list.
> > > > > > > +     */
> > > > > > > +    OFFLINED_REGION_A,
> > > > > > >
> > > > > > >
> > > > > > > ... and if you look for the word "offline" in the patch I think
> > > > > > > it's pretty clear that BOTH regions being merged do go offline at
> > > > > > > some point.  I guess it could be after the merge, too, not before....
> > > > > > >
> > > > > > > ... maybe others know?
> > > > > > >
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Otis
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Jan 19, 2015 at 4:17 AM, Vladimir Tretyakov <
> > > > > > > vladimir.tretyakov@sematext.com> wrote:
> > > > > > >
> > > > > > > > Hi, I have one question about 'online region merge'
> > > > > > > > (https://issues.apache.org/jira/browse/HBASE-7403).
> > > > > > > > As I understand it, the regions passed to the merge method will
> > > > > > > > be unavailable for some time.
> > > > > > > >
> > > > > > > > That means:
> > > > > > > > 1. Some data will be unavailable for some time.
> > > > > > > > 2. If a client tries to write data to these regions, it will get
> > > > > > > > exceptions.
> > > > > > > >
> > > > > > > > Are the above statements correct?
> > > > > > > >
> > > > > > > > Can somebody estimate how long 1 and 2 will hold? Seconds, minutes
> > > > > > > > or hours? Is there any way to avoid 1 and 2?
> > > > > > > >
> > > > > > > > I am asking because we have a problem with the number of regions
> > > > > > > > over time (our key contains a timestamp): the region count grows
> > > > > > > > constantly (through splitting), and it eventually causes
> > > > > > > > performance problems.
> > > > > > > > To avoid this effect we use 2 tables:
> > > > > > > > 1. The first table we use for writing and reading data.
> > > > > > > > 2. The second we use only for reading data.
> > > > > > > >
> > > > > > > > After some time we truncate the second table and rotate the
> > > > > > > > tables (the first becomes the second and the second becomes the
> > > > > > > > first). That allows us to control the region count, but the
> > > > > > > > solution looks a bit ugly. I looked at 'online region merge', but
> > > > > > > > we can't live with the restrictions I've described in the first
> > > > > > > > part of the question.
> > > > > > > >
> > > > > > > > Can somebody help with answers?
> > > > > > > >
> > > > > > > > Thx, Vladimir Tretyakov.
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
