hbase-user mailing list archives

From Laurent H <laurent.hat...@gmail.com>
Subject Re: Compaction after bulk-load
Date Thu, 30 Jul 2015 15:17:36 GMT
I don't think the number of regions per region server matters, as long as
your row key is well designed.
It may be worth checking the documentation on the number of HFiles in each
HRegion (there is some material about HFiles and minor compaction), since
that number can affect your write/read speed.
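
As a rough illustration of why the HFile count per store matters, here is a
minimal sketch of the two thresholds involved. It is a simplified model, not
HBase's actual selection logic. The parameter names mirror the HBase settings
hbase.hstore.compaction.min and hbase.hstore.blockingStoreFiles, but the
default values used below (3 and 10) and the exact comparisons are
assumptions that vary between HBase versions, so check your own release. The
function name store_state is hypothetical.

```python
def store_state(hfile_count, compaction_min=3, blocking_store_files=10):
    """Classify a store by its number of HFiles (simplified model).

    compaction_min and blocking_store_files are assumed values standing in
    for hbase.hstore.compaction.min and hbase.hstore.blockingStoreFiles.
    """
    if hfile_count > blocking_store_files:
        # Too many files: flushes to this store are held back until
        # compaction catches up, which slows writes.
        return "blocked"
    if hfile_count >= compaction_min:
        # Enough files for a minor compaction to be considered; reads
        # must consult every one of these files until it runs.
        return "minor-compaction-eligible"
    return "ok"


for count in (1, 3, 11):
    print(count, store_state(count))
```

A store that has just received a single bulk-loaded HFile falls in the "ok"
bucket in this model, which is one reason a major compaction right after the
load may not buy much.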

--
Laurent HATIER - Consultant Big Data & Business Intelligence chez CapGemini
fr.linkedin.com/pub/laurent-hatier/25/36b/a86/
<http://fr.linkedin.com/pub/laurent-h/25/36b/a86/>

2015-07-30 17:14 GMT+02:00 Laurent H <laurent.hatier@gmail.com>:

> Yes, it's one region = one reducer = one HFile generated.
>
> 2015-07-30 17:07 GMT+02:00 Krishna <research800@gmail.com>:
>
>> There are 10 region servers, and I can schedule compaction during the
>> weekend, when the write load is negligible.
>>
>> After reading the documentation, it's not clear how many HFiles are created
>> once the bulk-load finishes. Is it one HFile per reducer? My question is: is
>> it recommended to run a major compaction after a bulk-load if the number of
>> regions on each region server is not too high?
>>
>>
>> On Thursday, July 30, 2015, Ted Yu <yuzhihong@gmail.com> wrote:
>>
>> > How many region servers do you have in the cluster?
>> >
>> > Would there be concurrent write load on the cluster if you choose to run
>> > a major compaction? I ask because concurrent writes would be slowed down
>> > by the major compaction, and compacting 10 TB of data would take some
>> > time.
>> >
>> > Cheers
>> >
>> > On Wed, Jul 29, 2015 at 4:23 PM, Krishna <research800@gmail.com> wrote:
>> >
>> > > Hi,
>> > >
>> > > I am planning to bulk-load about 10 TB of data to a table pre-split with
>> > > 30 regions with max region file size configured to 10 GB.
>> > >
>> > > Is it recommended that I run a major compaction when bulk-loading
>> > > finishes? How many HFiles does the reducer create?
>> > >
>> >
>>
>
>
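
For the numbers in the original question (about 10 TB, 30 pre-split regions,
max region file size of 10 GB), a quick back-of-the-envelope calculation may
help frame the discussion. This is only a sketch: it assumes the data is
spread evenly across regions, that 1 TB = 1024 GB, and it ignores compression
and multiple column families. The function name bulk_load_sizing is
hypothetical.

```python
import math


def bulk_load_sizing(total_gb, regions, max_region_gb):
    """Estimate per-region data volume for an evenly distributed bulk load."""
    per_region_gb = total_gb / regions
    return {
        # Data each pre-split region would receive.
        "per_region_gb": per_region_gb,
        # How many times larger that is than the configured max region size.
        "oversize_factor": per_region_gb / max_region_gb,
        # Number of regions needed for each to stay at or under the max size.
        "regions_to_fit": math.ceil(total_gb / max_region_gb),
    }


plan = bulk_load_sizing(total_gb=10 * 1024, regions=30, max_region_gb=10)
print(plan)
```

Under these assumptions each of the 30 regions would receive roughly 341 GB,
about 34 times the configured 10 GB limit, so the regions would be expected
to split after the load. Staying under the limit from the start would take
on the order of 1024 regions.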
