hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Does 'online region merge' make regions unavailable for some time?
Date Fri, 23 Jan 2015 01:04:24 GMT
I did an experiment on a lightly populated table by merging the second and
third regions, which were on the same region server.

hbase(main):001:0> merge_region '41e99dcbb6e92a54925a4abc825c7dce',
'c53fbfe2dd70e7fb83d099aee3bf7758'
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation.
0 row(s) in 2.4670 seconds

When the two regions are on different servers, one of them needs to be
moved (re-assigned) so that they're on the same region server. The duration
for that case would be longer.

Online merge first writes out a reference file for each store file of the
two merging regions under a temporary merges directory. Then it moves that
temporary directory to its proper location in the filesystem. No store file
data is copied at merge time, which is why the whole process is fast.
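
To make the cost argument concrete, here is a minimal sketch of the
reference-then-rename idea using only the local filesystem. It is a toy
model, not HBase code: the function names, the ".merges_tmp" directory name
and the reference file layout are all hypothetical and chosen only for
illustration.

```python
import os
import tempfile
from pathlib import Path


def merge_by_reference(region_a: Path, region_b: Path,
                       table_dir: Path, merged_name: str) -> Path:
    """Toy model: write one small reference per store file, then rename."""
    tmp_dir = table_dir / ".merges_tmp"  # hypothetical temporary directory
    tmp_dir.mkdir()
    for region in (region_a, region_b):
        for store_file in sorted(region.iterdir()):
            # A reference only records where the data lives; no data is copied.
            ref = tmp_dir / f"{store_file.name}.{region.name}"
            ref.write_text(str(store_file))
    merged_dir = table_dir / merged_name
    # Renaming a directory within one filesystem is a metadata operation,
    # so its cost does not depend on how much data the regions hold.
    os.rename(tmp_dir, merged_dir)
    return merged_dir


def demo() -> dict:
    """Build two fake regions of 1 MB each and merge them by reference."""
    with tempfile.TemporaryDirectory() as root:
        table_dir = Path(root)
        for name in ("regionA", "regionB"):
            region = table_dir / name
            region.mkdir()
            (region / "storefile1").write_bytes(b"x" * 1_000_000)
        merged = merge_by_reference(table_dir / "regionA",
                                    table_dir / "regionB",
                                    table_dir, "regionC")
        refs = sorted(p.name for p in merged.iterdir())
        ref_bytes = sum(p.stat().st_size for p in merged.iterdir())
        data_bytes = sum(p.stat().st_size
                         for name in ("regionA", "regionB")
                         for p in (table_dir / name).iterdir())
        return {"refs": refs, "ref_bytes": ref_bytes, "data_bytes": data_bytes}


if __name__ == "__main__":
    print(demo())
```

In this toy model the merged directory holds only a few bytes of references
per store file, however large the store files themselves are, so the work
done at merge time grows with the number of store files rather than with
the region size.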

Cheers

On Wed, Jan 21, 2015 at 7:58 PM, Otis Gospodnetic <
otis.gospodnetic@gmail.com> wrote:

> Hi,
>
> On Wed, Jan 21, 2015 at 10:26 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > bq. doesn't that mean that a completely new region needs to be written
> >
> > Yes, a new region (C in the pdf) would be created.
> >
> > bq. If regions are a few GB in size
> >
> > The data files from both regions are moved to the merged region's
> > directory.
> >
>
> Does that "move" mean local disk writes and no sending of data over the
> network?
> Even if there is no network involved and it's just writes, wouldn't moving
> a region that's a few GB in size take more than just a couple of seconds?
>
> Plus, there is that other thing I mentioned - merge(regionA,regionB) ==>
> write a whole new multi-GB region to (local) disk is also something that
> would take on the order of minutes, no?
>
>
> > bq. (in flight) writes or reads going to the regions that are being
> > merged?
> >
> > The above operations have to wait for merged region to be assigned.
> >
>
> When you say "have to wait", do you mean:
> A) the application using HBase should not read and write data from regions
> being merged.  It should know it needs to wait with reads and writes until
> after regions are merged.
> OR
> B) HBase will block/buffer/delay reads and writes until the regions are
> merged and will then let them through
> ?
>
> Thanks,
> Otis
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> > On Wed, Jan 21, 2015 at 7:09 PM, Otis Gospodnetic <
> > otis.gospodnetic@gmail.com> wrote:
> >
> > > Thanks Enis & Ted!
> > > A few more questions inline.
> > >
> > > On Wed, Jan 21, 2015 at 9:53 PM, Enis Söztutar <enis.soz@gmail.com>
> > > wrote:
> > >
> > > > Online in this context is HBase cluster being online, not individual
> > > > regions. For the merge process, the regions go briefly offline similar
> > > > to how splits work. It should be on the order of seconds.
> > > >
> > >
> > > Hm, but how could it be so quick?  Aren't regions first offlined and then
> > > one of them is *moved*?  Or maybe data is not actually sent over the
> > > network?
> > >
> > > But if 2 regions are being merged, doesn't that mean that a completely new
> > > region needs to be written (over the network, to disk, and then HDFS
> > > replication also needs to take place).  If regions are a few GB in size,
> > > can that really be done in a matter of seconds?
> > >
> > > What happens to the (in flight) writes or reads going to the regions that
> > > are being merged?
> > >
> > > Thanks,
> > > Otis
> > > --
> > > Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> > > Solr & Elasticsearch Support * http://sematext.com/
> > >
> > >
> > >
> > > > On Wed, Jan 21, 2015 at 10:26 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > >
> > > > > Please take a look at slides 5 and 6 in this file:
> > > > >
> > > > > https://issues.apache.org/jira/secure/attachment/12561887/merge%20region.pdf
> > > > >
> > > > > It is clear that the two regions to be merged are taken offline in
> > > > > step 1.
> > > > >
> > > > > Cheers
> > > > >
> > > > > On Tue, Jan 20, 2015 at 5:26 PM, Otis Gospodnetic <
> > > > > otis.gospodnetic@gmail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Considering this is called the *online* region merge, I would assume
> > > > > > regions being merged never go offline during the merge and both regions
> > > > > > being merged are available for reading and writing at all times, even
> > > > > > during the merge.... though I don't get how writes would work if one
> > > > > > region is being moved from one RS to another.... so maybe this is not
> > > > > > truly online and writes are either rejected or buffered/blocked until
> > > > > > the region is moved AND merged?  Anyone knows for sure?
> > > > > >
> > > > > > I see this in one of the comments:
> > > > > > Q: If one (or both) of the regions were receiving non-trivial load
> > > > > > prior to this action, would client(s) be affected ?
> > > > > > A: Yes, region would be off services in a short time, it is equal with
> > > > > > moving region, e.g balance a region
> > > > > >
> > > > > > Also took a look at the patch:
> > > > > >
> > > > > > https://issues.apache.org/jira/secure/attachment/12574965/hbase-7403-trunkv33.patch
> > > > > >
> > > > > > And see:
> > > > > >
> > > > > > +    /**
> > > > > > +     * The merging region A has been taken out of the server's online regions list.
> > > > > > +     */
> > > > > > +    OFFLINED_REGION_A,
> > > > > >
> > > > > >
> > > > > > ... and if you look for the word "offline" in the patch I think it's
> > > > > > pretty clear that BOTH regions being merged do go offline at some
> > > > > > point.  I guess it could be after the merge, too, not before....
> > > > > >
> > > > > > ... maybe others know?
> > > > > >
> > > > > >
> > > > > > Thanks,
> > > > > > Otis
> > > > > > --
> > > > > > Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> > > > > > Solr & Elasticsearch Support * http://sematext.com/
> > > > > >
> > > > > >
> > > > > > On Mon, Jan 19, 2015 at 4:17 AM, Vladimir Tretyakov <
> > > > > > vladimir.tretyakov@sematext.com> wrote:
> > > > > >
> > > > > > > Hi, I have one question about 'online region merge'
> > > > > > > (https://issues.apache.org/jira/browse/HBASE-7403).
> > > > > > > As I understand it, regions passed to the merge method will be
> > > > > > > unavailable for some time.
> > > > > > >
> > > > > > > That means:
> > > > > > > 1. Some data will be unavailable for some time.
> > > > > > > 2. If a client tries to write data to these regions, it will get
> > > > > > > exceptions.
> > > > > > >
> > > > > > > Are the above statements correct?
> > > > > > >
> > > > > > > Can somebody estimate how long 1 and 2 will hold? Seconds, minutes
> > > > > > > or hours? Is there any way to avoid 1 and 2?
> > > > > > >
> > > > > > > I am asking because we have a problem with the number of regions
> > > > > > > over time (our key contains a timestamp): the region count grows
> > > > > > > constantly (splitting), and over time it becomes a cause of
> > > > > > > performance problems.
> > > > > > > To avoid this effect we use 2 tables:
> > > > > > > 1. The first table we use for writing and reading data.
> > > > > > > 2. The second we use only for reading data.
> > > > > > >
> > > > > > > After some time we truncate the second table and rotate the tables
> > > > > > > (the first becomes the second and the second becomes the first). That
> > > > > > > lets us control the region count, but the solution looks a bit ugly.
> > > > > > > I looked at 'online region merge', but we can't live with the
> > > > > > > restrictions I've described in the first part of the question.
> > > > > > >
> > > > > > > Can somebody help with answers?
> > > > > > >
> > > > > > > Thx, Vladimir Tretyakov.
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
