hbase-user mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Re: What is HBase compaction-queue-size at all?
Date Sat, 07 Dec 2013 13:24:06 GMT
> So I think it includes both running compactions and those in queue. Am I
> missing something?
Yes, that's correct. A major is just a compaction running on all the
regions, so a region server will count it like any other compaction. But it
can also be a minor that the RS is seeing. So it is not necessarily a major,
but it can be.
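
To illustrate the point from the quoted message below (the metric being the sum of the sizes of two queues), here is a minimal sketch using plain java.util.concurrent. It is not the HBase source: the class, the helper names and the pool sizes are hypothetical, and the two pools only stand in for largeCompactions and smallCompactions. With a standard ThreadPoolExecutor, getQueue().size() counts tasks that are waiting, while getActiveCount() reports (approximately) the tasks being executed, so whether a given HBase version's metric covers running compactions depends on how it is computed there.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class CompactionQueueSketch {
    // Hypothetical stand-in for one compaction pool with a fixed thread count.
    static ThreadPoolExecutor newPool(int threads) {
        return new ThreadPoolExecutor(threads, threads, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    // The metric as described in this thread: the sum of the two queue sizes.
    static int queueSize(ThreadPoolExecutor large, ThreadPoolExecutor small) {
        return large.getQueue().size() + small.getQueue().size();
    }

    // Submits five blocking tasks to a one-thread pool and returns
    // {queued, running} as observed while the first task is executing.
    static int[] demo() {
        ThreadPoolExecutor large = newPool(1);
        ThreadPoolExecutor small = newPool(1);
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < 5; i++) {
                small.execute(() -> {
                    started.countDown();
                    try {
                        release.await(); // pretend to be a long compaction
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            started.await(); // wait until one task is actually running
            int queued = queueSize(large, small);
            int running = large.getActiveCount() + small.getActiveCount();
            return new int[] {queued, running};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown(); // let the blocked tasks finish
            large.shutdown();
            small.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("queued=" + r[0] + " running=" + r[1]);
    }
}
```

In this sketch the one task that is executing is not part of the queue size; only the four waiting tasks are. That is a property of java.util.concurrent, not a statement about any particular HBase release.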


2013/12/2 Bharath Vissapragada <bharathv@cloudera.com>

> Hi,
>
>
> On Mon, Dec 2, 2013 at 8:07 PM, Jean-Marc Spaggiari
> <jean-marc@spaggiari.org> wrote:
>
> >    - Is it the *number of Stores* on the regionserver that need to be
> >    major compacted, or the number that are *being* compacted currently?
> >
> >
> > This is the number that are currently in the pipe. That doesn't mean they
> > are compacting right now, but they are queued for compaction, and not
> > necessarily for major compaction. Major is only if all the regions need to
> > compact.
> >
>
> Are you sure about this? I had a quick look at the code and this value is
> the sum of the sizes of the queues largeCompactions and smallCompactions.
> The code doesn't keep track of whether they are running or in the queue. So
> I think it includes both running compactions and those in queue. Am I
> missing something?
>
>
> > "I discovered that at some point it reported *regionserver
> > compaction-queue-size = 4* (I checked it from Ambari). That's theoretically
> > impossible, since I have only *one Store* to write to (sequential key) at
> > any time; incurring only one major compaction would be more reasonable."
> >
>
> Adding to what JMS said, compaction is a per region thing. If your write
> test creates multiple regions, there is a possibility that multiple
> compactions happen at the same time since they are queued.
>
>
> >
> > Why is this "impossible"? A store file is a dump of HBase memory blocks
> > written to disk. Even if you write to a single region of a single table,
> > with keys that are all close together (even if it's all the exact same
> > key), when the block in memory reaches a threshold, it is written to disk.
> > When more than x blocks (3 is the default) are on disk, a compaction is
> > launched.
> >
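
The flush-then-compact cycle described just above can be sketched as a toy simulation. Everything here is an assumption for illustration: the class and method names are hypothetical, the flush size is in arbitrary units, and the trigger fires at "at least 3 files" (the exact comparison may differ between HBase versions). It only shows that a single Store receiving sequential writes keeps generating compaction requests over time.

```java
import java.util.ArrayList;
import java.util.List;

class FlushCompactionSketch {
    static final long FLUSH_SIZE = 128;        // hypothetical flush threshold (arbitrary units)
    static final int COMPACTION_THRESHOLD = 3; // the default of 3 mentioned in the thread

    // Returns how many compactions a single store would request
    // after the given number of writes of the given size.
    static int simulate(int writes, long bytesPerWrite) {
        List<Long> storeFiles = new ArrayList<>();
        long memstore = 0;
        int compactions = 0;
        for (int i = 0; i < writes; i++) {
            memstore += bytesPerWrite;
            if (memstore >= FLUSH_SIZE) {
                // Flush: the in-memory data becomes a new store file.
                storeFiles.add(memstore);
                memstore = 0;
                if (storeFiles.size() >= COMPACTION_THRESHOLD) {
                    // Compaction: merge all store files into one.
                    long merged = 0;
                    for (long size : storeFiles) {
                        merged += size;
                    }
                    storeFiles.clear();
                    storeFiles.add(merged);
                    compactions++;
                }
            }
        }
        return compactions;
    }

    public static void main(String[] args) {
        System.out.println(simulate(9, 128)); // nine flushes of one store
    }
}
```

In the sketch each compaction completes instantly. On a real region server a compaction takes time (39sec in the log quoted further down), so requests from one or more regions can pile up behind a running one.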
> >    - What is even more confusing: wasn't multi-threading enabled in an
> >    earlier version, so that each compaction job is allocated to a thread?
> >    If so, why is there a compaction queue waiting for processing?
> >
> > Yes, compaction is done on a separate thread, but there is one single
> > queue. You don't want to take 100% of your RS resources to do
> > compactions...
> >
> > If you are doing mostly writes and almost no reads, you might want to
> > tweak some parameters. You might also want to look into bulk loading...
> >
> > Last, maybe you should review your key and distribution.
> >
> > And last again ;) What is your table definition? Multiplying the column
> > families can also sometimes lead to this kind of issue...
> >
> > JM
> >
> >
> >
> >
> > 2013/12/2 林煒清 <thesuperching@gmail.com>
> >
> > > Does anyone know what compaction queue size means?
> > >
> > > By doc's definition:
> > >
> > > *9.2.5.* hbase.regionserver.compactionQueueSize Size of the compaction
> > > queue. This is the number of stores in the region that have been
> > > targeted for compaction.
> > >
> > >
> > >    - Is it the *number of Stores* on the regionserver that need to be
> > >    major compacted, or the number that are *being* compacted currently?
> > >
> > > I have a job writing data in a hotspot style, using a sequential
> > > (non-distributed) key with 1 family, so there is 1 Store per region.
> > >
> > > I discovered that at some point it reported *regionserver
> > > compaction-queue-size = 4* (I checked it from Ambari). That's
> > > theoretically impossible, since I have only *one Store* to write to
> > > (sequential key) at any time; incurring only one major compaction would
> > > be more reasonable.
> > >
> > >
> > >    - Then I dug into the logs and found no hint of a queue size > 0.
> > >    Every major compaction just says *"This selection was in queue for
> > >    0sec"*. I don't really understand what that means. Is it saying HBase
> > >    has nothing in the compaction queue?
> > >
> > > 013-11-26 12:28:00,778 INFO
> > > [regionserver60020-smallCompactions-1385440028938] regionserver.HStore:
> > > Completed major compaction of 3 file(s) in f1 of myTable.key.md5....
> > > into md5....(size=607.8 M), total size for store is 645.8 M. *This
> > > selection was in queue for 0sec*, and took 39sec to execute.
> > >
> > >
> > >    - What is even more confusing: wasn't multi-threading enabled in an
> > >    earlier version, so that each compaction job is allocated to a thread?
> > >    If so, why is there a compaction queue waiting for processing?
> > >
> >
>
>
>
> --
> Bharath Vissapragada
> <http://www.cloudera.com>
>
