hbase-user mailing list archives

From Laurent H <laurent.hat...@gmail.com>
Subject Re: Compaction after bulk-load
Date Thu, 30 Jul 2015 15:14:05 GMT
Yes, it's one region = one reducer = one HFile generated.

--
Laurent HATIER - Consultant Big Data & Business Intelligence chez CapGemini
fr.linkedin.com/pub/laurent-hatier/25/36b/a86/
<http://fr.linkedin.com/pub/laurent-h/25/36b/a86/>

2015-07-30 17:07 GMT+02:00 Krishna <research800@gmail.com>:

> There are 10 region servers, and I can schedule compaction during the
> weekend, when the write load is negligible.
>
> After reading the documentation, it's not clear how many HFiles are created
> once bulk-load finishes - is it one HFile per reducer? My question is: is
> it recommended to run major compaction after bulk-load if the # of regions
> on each region server is not too high?
>
>
> On Thursday, July 30, 2015, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > How many region servers do you have in the cluster?
> >
> > Would there be concurrent write load on the cluster if you choose to run
> > major compaction? I ask because concurrent writes would be slowed down
> > by the major compaction, and compacting 10 TB of data would take some
> > time.
> >
> > Cheers
> >
> > On Wed, Jul 29, 2015 at 4:23 PM, Krishna <research800@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I am planning to bulk-load about 10 TB of data to a table pre-split
> > > with 30 regions with max region file size configured to 10 GB.
> > >
> > > Is it recommended that I run a major compaction when bulk-loading
> > > finishes? How many HFiles does the reducer create?
> > >
> >
>
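
For a rough sense of scale, the figures quoted in the thread (10 TB, 30 regions, 10 GB max region file size) can be turned into a per-region estimate. This is only a back-of-the-envelope sketch: it assumes the data is spread evenly across the regions and uses binary units, neither of which is stated in the thread, and the variable names are illustrative.

```python
# Back-of-the-envelope sizing from the numbers quoted in the thread.
# Assumptions (not from the thread): even key distribution, binary units.
TB = 1024 ** 4
GB = 1024 ** 3

total_bytes = 10 * TB        # data to bulk-load
num_regions = 30             # pre-split regions
max_region_bytes = 10 * GB   # configured max region file size

# Average amount of data that would land in each region.
per_region = total_bytes / num_regions

# How far that average is above the configured maximum.
ratio = per_region / max_region_bytes

print(f"{per_region / GB:.1f} GB per region")            # 341.3 GB per region
print(f"{ratio:.1f}x the configured max region size")    # 34.1x
```

Under those assumptions each region would receive far more data than the configured maximum region file size, which may be worth weighing alongside the compaction question.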
