hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Reducing impact of compactions on read performance
Date Tue, 18 May 2010 18:06:40 GMT
> Resending this to hbase-user@hadoop.apache.org because my mail to
> user@hbase.apache.org failed with "550 550 mail to user@hbase.apache.org not
> accepted here (state 14)".  Is the reply-to getting set correctly?  Anyway,
> responses inline...

Yeah, that's strange, I just saw it too. It's probably related to the
fact that Apache infra is moving our mailing lists now that we are a
top-level project.

> Here is a region server log from yesterday: http://pastebin.com/5a04kZVj
> Every time one of those compactions ran (around 1pm, 4pm, 6pm, etc.) our
> read performance took a big hit.  BTW, is there a way I can tell by looking
> at the logs whether a minor or major compaction is running?  Yes, we do see
> lots of I/O wait (as high as 30-40% at times) when the compactions are
> running and reads are slow.  Load averages during compactions can spike as
> high as 60.

Yeah, high IO wait will have a direct impact on read performance. Are
the machines swapping? How much heap did you give the region servers?

I see that you're not running with DEBUG, only INFO, so we cannot see
which type of compaction is going on.

> OK, I'll set up a cron to kick majors off when load is at its lowest.  Can't
> hurt I suppose.

That's probably the best option for the moment.
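
For what it's worth, here is a minimal sketch of a script such a cron
entry could run. The table name, the load threshold and the helper names
are placeholders of mine, not anything from your setup, and it assumes
the `hbase` launcher is on the PATH. It only looks at the load of the
machine it runs on, and it sends `major_compact` through the HBase shell
when the 5-minute load average is under the threshold.

```python
import os
import subprocess

# Hypothetical values, adjust for your cluster.
TABLE = "mytable"
MAX_LOAD = 4.0


def should_compact(load_avg, max_load=MAX_LOAD):
    """Return True when the load average is low enough to start a major compaction."""
    return load_avg < max_load


def major_compact_script(table):
    """Build the HBase shell input that requests a major compaction of one table."""
    return "major_compact '{}'\n".format(table)


def main():
    # os.getloadavg() returns the 1, 5 and 15 minute load averages (Unix only).
    _, five_min, _ = os.getloadavg()
    if not should_compact(five_min):
        print("load %.1f too high, skipping" % five_min)
        return
    # Pipe the command into the HBase shell; assumes `hbase` is on the PATH.
    subprocess.run(
        ["hbase", "shell"],
        input=major_compact_script(TABLE),
        universal_newlines=True,
        check=True,
    )
```

Call `main()` at the bottom of the script and schedule it from cron at
the hour your read traffic is lowest; the load check is just a safety
net on top of the schedule.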

>> HBase limits the rate of inserts to not be overrun by WALs so that if
>> a machine fails, you don't have to split GBs of files. What about
>> inserting more slowly into your cluster? Flushes/compactions will be
>> more spread over time?
>> Disabling the WAL during your insert will make it a lot faster, not
>> necessarily what you want here.
> Our inserts are already fairly fast.  I think we usually get around
> 30,000/sec when we do these bulk imports.  I'm less concerned about insert
> speed and more concerned about the impact to reads when we do the bulk
> imports and a compaction is triggered.  Do you think it makes sense to
> disable WAL for the bulk inserts in this case?  Would disabling WAL decrease
> the number of compactions that are required?

That is my point: try uploading more slowly. Disabling the WAL, like I
said, will speed up the upload since you no longer write to the WAL, so
it won't reduce the number of compactions; they will be triggered at an
even faster rate!
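
To make "uploading slower" concrete, here is a rough sketch of pacing a
bulk import on the client side. `throttled_upload` and its `write_batch`
callback are hypothetical names; `write_batch` stands in for whatever
your importer uses to send a batch of rows to HBase. The loop sleeps
after each batch so the average rate stays at or below `rows_per_sec`.

```python
import time


def throttled_upload(batches, rows_per_sec, write_batch,
                     sleep=time.sleep, clock=time.monotonic):
    """Send each batch via write_batch, pacing to at most rows_per_sec on average.

    write_batch is a placeholder for the real client call that writes rows.
    Returns the total number of rows sent.
    """
    start = clock()
    sent = 0
    for batch in batches:
        write_batch(batch)
        sent += len(batch)
        # Earliest time at which `sent` rows are allowed to have been written.
        target = start + sent / rows_per_sec
        delay = target - clock()
        if delay > 0:
            sleep(delay)
    return sent
```

Starting well below your current 30,000/sec and raising the limit while
watching IO wait on the region servers would show how much headroom the
reads need.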

> OK, I'm eagerly awaiting the next release.  Seems like there have been lots
> of good improvements since 0.20.3!

Lots of people working very hard :P

>> >
>> > Thanks,
>> > James
>> >
