hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Issue with StoreFiles with bulk import.
Date Fri, 13 Aug 2010 15:15:40 GMT
HTable flushCommits flushes the client-side write buffer.  That
is different from the server flushing its memstores when they hit
their size limit, which is the log message you are seeing below.  The
two are not directly related.
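
Here is a minimal sketch of the distinction, assuming the 0.20-era client
API (the class name and the table name "table" are made up): Puts sit in
the client's write buffer until flushCommits() is called or the buffer
fills, while memstore flushes happen on the region server on their own
schedule.

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ClientBufferSketch {
      public static void main(String[] args) throws Exception {
        HTable table = new HTable(new HBaseConfiguration(), "table");
        table.setAutoFlush(false);                  // buffer writes client-side
        table.setWriteBufferSize(2 * 1024 * 1024);  // ~2MB client-side buffer
        Put put = new Put(Bytes.toBytes("rowkey"));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value"));
        table.put(put);        // queued in the client buffer, not yet on a server
        table.flushCommits();  // pushes the buffered Puts to the region servers
      }
    }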



On Fri, Aug 13, 2010 at 6:47 AM, Jeremy Carroll
<jeremy.carroll@networkedinsights.com> wrote:
> I'm wondering if our code is calling tableCommit too frequently and flushing the regions
> to disk when it does not have to. I'm seeing a lot of these entries.
>
> 2010-08-13 08:44:44,228 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
> MSG_REGION_FLUSH: table,20100812 d92a4edf020575f5f7209e54981cc6b97b82d2b8,1281662636511.eea5d3079cb2d7c8637c725947e87428.
>
> Also for the job counters I see that in the map phase we called NUMBER_OF_TABLE_COMMITS
> 3,049 times. Does calling tableCommit flush the region to disk and create a new store file?
> If so, I believe we can just move the tableCommit to the reduce phase instead of during the
> map phase.
>

I'd say use a smallish buffer -- 2M -- and then just let hbase manage
flushing.  When the map is done, flushCommits should be called to clean
out anything left in the client-side write buffer.
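
A minimal sketch of that setup, assuming the Hadoop mapreduce Mapper API and
the 0.20-era HTable client (the mapper class, table name "mytable", and
column family below are all made up): the table buffers Puts client-side in
a ~2MB buffer, and flushCommits() in cleanup() pushes whatever is left when
the map task finishes.

    import java.io.IOException;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class ImportMapperSketch extends Mapper<LongWritable, Text, LongWritable, Text> {
      private HTable table;

      @Override
      protected void setup(Context context) throws IOException {
        table = new HTable(new HBaseConfiguration(context.getConfiguration()), "mytable");
        table.setAutoFlush(false);                  // let the client buffer writes
        table.setWriteBufferSize(2 * 1024 * 1024);  // smallish ~2MB buffer
      }

      @Override
      protected void map(LongWritable key, Text value, Context context) throws IOException {
        Put put = new Put(Bytes.toBytes(key.get()));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(value.toString()));
        table.put(put);  // buffered client-side; hbase manages memstore flushing on its own
      }

      @Override
      protected void cleanup(Context context) throws IOException {
        table.flushCommits();  // clean out anything left in the client-side buffer
      }
    }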

St.Ack


>
> ________________________________________
> From: Jeremy Carroll [jeremy.carroll@networkedinsights.com]
> Sent: Friday, August 13, 2010 8:26 AM
> To: user@hbase.apache.org
> Subject: RE: Issue with StoreFiles with bulk import.
>
> The main issue is that the stores get very fragmented. I've seen as many as 300 storeFiles
> for one region during this process. I'm concerned about performance with that many files to
> search through. We see this in the UI for the regions list (storeFiles=?), and we see it
> through our JMX monitoring as well. If there is no issue with performance I'm OK. But as it
> seems right now, the only way I can decrease the storeFile count is to do a major compaction.
> We only do one major compaction per day to minimize load on the system. So during the day
> am I going to see drastically reduced performance with 30,000 storeFiles versus 800 when it's
> major compacted? The 90 second timeout was the flush wait timeout. We are inserting pretty
> fast so the flush timeout hits its 90 second limit and aborts.
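
For reference, that daily major compaction can also be requested explicitly;
a minimal sketch assuming the 0.20-era admin API (the class name and table
name "table" are made up, and the once-a-day scheduling around it is left
out):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class NightlyMajorCompactSketch {
      public static void main(String[] args) throws Exception {
        HBaseAdmin admin = new HBaseAdmin(new HBaseConfiguration());
        // Asks every region of the table to rewrite its store files down to one per store.
        admin.majorCompact("table");
      }
    }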
> ________________________________________
> From: saint.ack@gmail.com [saint.ack@gmail.com] On Behalf Of Stack [stack@duboce.net]
> Sent: Friday, August 13, 2010 12:33 AM
> To: user@hbase.apache.org
> Subject: Re: Issue with StoreFiles with bulk import.
>
> On Thu, Aug 12, 2010 at 1:13 PM, Jeremy Carroll
> <jeremy.carroll@networkedinsights.com> wrote:
>> I'm currently importing some files into HBase and am running into a problem with
>> a large number of store files being created.
>
> Where do you see this, Jeremy?  In the UI?  What kinda numbers are you seeing?
>
>
>> We have some back data which is stored in very large sequence files (3-5 GB in size).
>> When we import this data the number of stores created does not get out of hand.
>
>
> So when you mapreduce using these big files as the source and insert into
> hbase, it's not an issue?
>
>
>> When we switch to smaller sequence files being imported we see that the number of
>> stores rises quite dramatically.
>
>
> Why do you need to change?
>
>
>> I do not know if this is happening because we are flushing the commits more frequently
>> with smaller files.
>
> Probably.  Have you tinkered with hbase default settings in any way?
>
> Perhaps you are getting better parallelism when there are lots of small files to
> chomp on?  More concurrent maps/clients?  So the rate of upload goes up?
>
>
>> I'm wondering if anybody has any advice regarding this issue. My main concern is that
>> during this process we do not finish flushing to disk (and we set WriteToWAL to false). We
>> always hit the 90 second timeout due to heavy write load. As these store files pile up and
>> do not get committed to disk, we run into issues where we could lose a lot of data if
>> something were to crash.
>>
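
A minimal sketch of that WAL trade-off in the client, assuming the 0.20-era
API (the helper class, row, and column family names are made up): with
writeToWAL disabled, an edit lives only in the region server's memstore until
that memstore flushes to a store file, so a crash before the flush loses it.

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class NoWalPutSketch {
      // Writes one cell without the write-ahead log; until the region server's
      // memstore flushes, this edit exists only in memory.
      static void putWithoutWal(HTable table, byte[] row) throws IOException {
        Put put = new Put(row);
        put.setWriteToWAL(false);  // skip the WAL for speed, at the cost of durability
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value"));
        table.put(put);
      }
    }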
>
>
> The 90 second timeout is the regionserver timing out against
> zookeeper?  Or is it something else?
>
> Storefiles are on the filesystem, so what do you mean by the above fear
> of their not being committed to disk?
>
>
>> I have created screen shots of our monitoring application for HBase which show the
>> spikes in activity.
>>
>> http://twitpic.com/photos/jeremy_carroll
>>
>
>
> Nice pictures.
>
> 30k storefiles is a good number.  They will go up while you are doing a
> bulk load because the compactor is probably overrun.  HBase will usually
> catch up though, especially after the upload completes.
>
> Do you have compression enabled?
>
> I see regions growing steadily rather than spiking as the comment on
> the graph says.  500 regions ain't too many...
>
> How many servers in your cluster?
>
> St.Ack
>
>
>>
>>
>
