hbase-user mailing list archives

From Shahab Yunus <shahab.yu...@gmail.com>
Subject Re: Forcibly merging regions
Date Fri, 14 Nov 2014 17:05:31 GMT
Thanks again.

But I have been polling for a while and it still doesn't merge. Take the
particular region example that I sent you: I have been trying to merge it
since yesterday. I ran the polling-based code all night and had to kill it.
Then in the morning, I tried merging manually through the hbase shell and it
still doesn't merge. Note that the current polling logic does not try to
call merge again. It just checks the number of regions.
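
A bounded version of the polling described above could look roughly like the sketch below. It is only an illustration, not our actual application code: the class MergePoller, the Result enum and the regionCount hook are hypothetical names, and in a real client the hook would presumably wrap a fresh HBaseAdmin.getTableRegions(tableName) call. The point is that the loop gives up after a fixed number of polls instead of spinning forever, so the caller can decide whether to re-issue the merge or inspect the master log.

```java
import java.util.function.IntSupplier;

/** Bounded polling for an asynchronous region merge (illustrative sketch only). */
final class MergePoller {

    /** Outcome of one bounded wait. */
    enum Result { MERGED, TIMED_OUT, INTERRUPTED }

    /**
     * Polls the region count until it drops below countBeforeMerge,
     * or until maxPolls checks have been made.
     *
     * regionCount is a hypothetical hook; a real client might back it
     * with the size of the list returned by HBaseAdmin.getTableRegions.
     */
    static Result awaitMerge(IntSupplier regionCount, int countBeforeMerge,
                             int maxPolls, long sleepMillis) {
        for (int i = 0; i < maxPolls; i++) {
            if (regionCount.getAsInt() < countBeforeMerge) {
                return Result.MERGED;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and let the caller decide what to do.
                Thread.currentThread().interrupt();
                return Result.INTERRUPTED;
            }
        }
        return Result.TIMED_OUT;
    }
}
```

On TIMED_OUT the caller knows the merge did not take effect within the allowed window, which is the case we currently cannot distinguish from a slow merge.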

So how do we clean it up, or actually make it merge? Also, is this
expected (a region keeping a reference)? How can we avoid it?

Note that this is not limited to this table only. We are seeing this in
other regions of other tables as well. Are we merging too fast?
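
Since the master log quoted further down shows the merge being skipped with a "has merge qualifier" message, one way to tell a skipped merge from a slow one is to scan the master log lines for that message. The sketch below is only an assumption about how that could be done: MergeSkipScanner and blockingRegion are hypothetical names, and the pattern is based solely on the log line quoted in this thread, so it may not match other HBase versions.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Detects the "Skip merging regions" message in a master log line (sketch). */
final class MergeSkipScanner {

    // Based on the message quoted in this thread; region encoded names
    // are assumed to be 32 lowercase hex characters.
    private static final Pattern SKIPPED = Pattern.compile(
        "Skip merging regions .*, because region ([0-9a-f]{32}) has merge qualifier");

    /**
     * Returns the encoded name of the region that blocked the merge,
     * or an empty Optional if the line is not a skip message.
     */
    static Optional<String> blockingRegion(String logLine) {
        Matcher m = SKIPPED.matcher(logLine);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

Applied to the skip message quoted below, this would report region 7373f75181c71eb5061a6673cee15931 as the one still carrying a merge qualifier.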



Regards,
Shahab

On Fri, Nov 14, 2014 at 11:58 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> Polling as you described is fine.
>
> catalogJanitor.cleanMergeQualifier() is called by
> DispatchMergingRegionHandler.
>
> If clean was successful, you would see the following:
>
>       LOG.debug("Deleting region " + regionA.getRegionNameAsString() + " and "
>           + regionB.getRegionNameAsString()
>           + " from fs because merged region no longer holds references");
>
> Assuming the following error does not appear in your master log:
>
>       LOG.error("Merged region " + region.getRegionNameAsString()
>           + " has only one merge qualifier in META.");
>
> It would mean that 7373f75181c71eb5061a6673cee15931 still has
> reference files.
>
> Cheers
>
> On Fri, Nov 14, 2014 at 8:35 AM, Shahab Yunus <shahab.yunus@gmail.com>
> wrote:
>
> > Hi Ted.
> >
> > The relevant log excerpt is at the end of this email. It covers the merge
> > command that I issued just now through the hbase shell. forcible was
> > false, but it behaves the same way when forcible is true. This is from
> > the master log. Indeed, the region merge was skipped! What does this
> > mean? The data seems to be intact for this table.
> >
> > Just to give you some background: this table was first merged by our
> > automated Java application. We are merging regions programmatically.
> > As the HBaseAdmin.mergeRegions call is async, we poll for the number of
> > regions to drop after this merge call. The application hangs and keeps
> > polling forever because the previous merge didn't happen.
> >
> > In this poll loop, we get the number of regions with a fresh call to
> > HBaseAdmin.getTableRegions(tableName).size().
> >
> > What are these merge qualifiers, and what are we doing wrong, or what
> > should we do?
> >
> > Could we retry the merge from within the polling loop? But how would we
> > know that we need to call merge again, given that it works for some
> > regions? Is the table meta corrupted for some reason by the above logic?
> >
> > Thanks a lot.
> >
> >
> >
> > ------------------------------------------------------------------------
> >
> > 2014-11-14 11:25:02,643 INFO org.apache.zookeeper.ZooKeeper: Session:
> > 0x348c7017707236b closed
> > 2014-11-14 11:25:02,643 INFO org.apache.zookeeper.ClientCnxn: EventThread
> > shut down
> > 2014-11-14 11:25:02,645 INFO org.apache.zookeeper.ZooKeeper: Initiating
> > client connection,
> >
> >
> connectString=ip-1010019.ec2.internal:2181,ip-1010017.ec2.internal:2181,ip-1010018.ec2.internal:2181
> > sessionTimeout=60000 watcher=catalogtracker-on-hconnection-0x47d865f2,
> >
> >
> quorum=ip-1010019.ec2.internal:2181,ip-1010017.ec2.internal:2181,ip-1010018.ec2.internal:2181,
> > baseZNode=/hbase
> > 2014-11-14 11:25:02,645 INFO
> > org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process
> > identifier=catalogtracker-on-hconnection-0x47d865f2 connecting to
> ZooKeeper
> >
> >
> ensemble=ip-1010019.ec2.internal:2181,ip-1010017.ec2.internal:2181,ip-1010018.ec2.internal:2181
> > 2014-11-14 11:25:02,645 INFO org.apache.zookeeper.ClientCnxn: Opening
> > socket connection to server ip-1010018.ec2.internal/1010019:2181. Will
> not
> > attempt to authenticate using SASL (unknown error)
> > 2014-11-14 11:25:02,646 INFO org.apache.zookeeper.ClientCnxn: Socket
> > connection established to ip-1010018.ec2.internal/1010019:2181,
> initiating
> > session
> > 2014-11-14 11:25:02,648 INFO org.apache.zookeeper.ClientCnxn: Session
> > establishment complete on server ip-1010018.ec2.internal/1010019:2181,
> > sessionid = 0x348c7017707236c, negotiated timeout = 60000
> > 2014-11-14 11:25:02,703 INFO org.apache.zookeeper.ZooKeeper: Session:
> > 0x348c7017707236c closed
> > 2014-11-14 11:25:02,703 INFO org.apache.zookeeper.ClientCnxn: EventThread
> > shut down
> > 2014-11-14 11:25:30,713 INFO
> > org.apache.hadoop.hbase.master.handler.DispatchMergingRegionHandler: Skip
> > merging regions
> > TABLE_NAME,,1415915112497.7373f75181c71eb5061a6673cee15931.,
> >
> >
> TABLE_NAME,\x02\xFA\xF0\x80\x00\x00\x01I\xAA\xD5\x87\xA8\x19\x99\x99\x99\x99\x99\x99\x90,1415910559217.43f4d3685d113d3ae18eea9f189de096.,
> > because region 7373f75181c71eb5061a6673cee15931 has merge qualifier
> > 2014-11-14 11:25:41,383 INFO org.apache.zookeeper.ZooKeeper: Initiating
> > client connection,
> >
> >
> connectString=ip-1010019.ec2.internal:2181,ip-1010017.ec2.internal:2181,ip-1010018.ec2.internal:2181
> > sessionTimeout=60000 watcher=catalogtracker-on-hconnection-0x47d865f2,
> >
> >
> quorum=ip-1010019.ec2.internal:2181,ip-1010017.ec2.internal:2181,ip-1010018.ec2.internal:2181,
> > baseZNode=/hbase
> > 2014-11-14 11:25:41,384 INFO
> > org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process
> > identifier=catalogtracker-on-hconnection-0x47d865f2 connecting to
> ZooKeeper
> >
> >
> ensemble=ip-1010019.ec2.internal:2181,ip-1010017.ec2.internal:2181,ip-1010018.ec2.internal:2181
> > 2014-11-14 11:25:41,384 INFO org.apache.zookeeper.ClientCnxn: Opening
> > socket connection to server ip-1010018.ec2.internal/1010019:2181. Will
> not
> > attempt to authenticate using SASL (unknown error)
> > 2014-11-14 11:25:41,386 INFO org.apache.zookeeper.ClientCnxn: Socket
> > connection established to ip-1010018.ec2.internal/1010019:2181,
> initiating
> > session
> > 2014-11-14 11:25:41,389 INFO org.apache.zookeeper.ClientCnxn: Session
> > establishment complete on server ip-1010018.ec2.internal/1010019:2181,
> > sessionid = 0x348c7017707236e, negotiated timeout = 60000
> > 2014-11-14 11:25:41,397 INFO org.apache.zookeeper.ZooKeeper: Session:
> > 0x348c7017707236e closed
> > 2014-11-14 11:25:41,398 INFO org.apache.zookeeper.ClientCnxn: EventThread
> > shut down
> >
> >
> ------------------------------------------------------------------------------------------------------------------------------------
> >
> > Regards,
> > Shahab
> >
> > On Fri, Nov 14, 2014 at 10:56 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > > Looking at DispatchMergingRegionHandler, it performs some checks before
> > > initiating the merge, e.g.:
> > >
> > >       LOG.info("Skip merging regions " + region_a.getRegionNameAsString()
> > >           + ", " + region_b.getRegionNameAsString() + ", because region "
> > >           + (regionAHasMergeQualifier ? region_a.getEncodedName() : region_b
> > >               .getEncodedName()) + " has merge qualifier");
> > >
> > > Can you take a look at the master log around the time the merge request
> > > was issued to see if you can get some clue?
> > >
> > > Cheers
> > >
> > > On Fri, Nov 14, 2014 at 6:41 AM, Shahab Yunus <shahab.yunus@gmail.com>
> > > wrote:
> > >
> > > > The documentation of the online merge tool (merge_region) states that
> > > > if we forcibly merge regions (by setting the 3rd attribute to true),
> > > > it can create overlapping regions. If this happens, will it render
> > > > the region or table unusable, or is it just a performance hit? I
> > > > mean, how big of a deal is it?
> > > >
> > > > Actually, we are merging regions using the programmatic API and
> > > > setting this flag ('forcible') to false. But for some tables (we
> > > > haven't figured out a pattern yet; the data is still accessible), the
> > > > merge of regions does not happen at all. Afterwards we tried with
> > > > this flag = true, and it still doesn't merge them.
> > > >
> > > > CDH 5.1.0
> > > > (Hbase is 0.98.1-cdh5.1.0)
> > > >
> > > > Regards,
> > > > Shahab
> > > >
> > >
> >
>
