hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Problems with write performance (25kb rows)
Date Sun, 10 Jan 2010 17:07:11 GMT

You have this line:

2010-01-08 21:25:24,709 WARN org.apache.hadoop.hbase.util.Sleeper: We
slept 66413ms, ten times longer than scheduled: 3000

That's a garbage collector pause that lasted more than a minute, which
is longer than the default timeout after which a region server is
considered dead (40 seconds in 0.20 unless you are using 0.20.3RC1).
The master then replayed the write-ahead logs and reopened the regions
elsewhere.
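
As a rough illustration of the comparison above, here is a small Python
sketch that scans region server log text for Sleeper warnings and reports
pauses longer than the session timeout. It is not part of HBase: the
function name, the regular expression, and the 40000 ms constant (taken
from the 40 second figure above) are assumptions made for this example.

```python
import re

# Assumption: 40 s timeout, as stated for 0.20 in the message above.
SESSION_TIMEOUT_MS = 40_000

# Matches the Sleeper warning quoted above, even when the log line
# has been wrapped across two lines by the mail client.
SLEEPER_RE = re.compile(
    r"We\s+slept\s+(\d+)ms,.*?scheduled:\s*(\d+)", re.DOTALL
)


def long_pauses(log_text, timeout_ms=SESSION_TIMEOUT_MS):
    """Return (slept_ms, scheduled_ms) for each pause above timeout_ms."""
    pauses = []
    for match in SLEEPER_RE.finditer(log_text):
        slept, scheduled = int(match.group(1)), int(match.group(2))
        if slept > timeout_ms:
            pauses.append((slept, scheduled))
    return pauses


sample = (
    "2010-01-08 21:25:24,709 WARN org.apache.hadoop.hbase.util.Sleeper: We\n"
    "slept 66413ms, ten times longer than scheduled: 3000\n"
)
print(long_pauses(sample))
```

Any pause reported this way is long enough for the region server to be
declared dead, so it is worth checking the log for these before looking
at anything else.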

You want to set a larger heap in conf/hbase-env.sh because the
default 1GB is way too low; give it as much as you can without
swapping.

J-D

On Sat, Jan 9, 2010 at 4:06 AM, Dmitriy Lyfar <dlyfar@gmail.com> wrote:
> Hello,
>
> 2010/1/5 Jean-Daniel Cryans <jdcryans@apache.org>
>
>> WRT your last 2 emails, HBase ships with defaults that work safely
>> for most users and are in no way tuned for a one-time upload.
>> Playing with the memstore size like you did makes sense.
>>
>> Now you said you were inserting with the row key being a reversed ts...
>> are all threads using the same key space when uploading? I ask this
>> because if all 60 threads are almost always hitting the same region
>> (a different one over time), then all 60 threads are filling up the
>> same memstore really fast, then all wait for the snapshot, eventually
>> all wait for the same region split, and in the meantime fill the same
>> WAL, which will probably be rolled several times. Is that the case?
>>
>> You could also post a region server log for us to analyze.
>>
>
> Now I'm using random int keys to distribute the load between regionservers.
> I no longer use a threaded client but a multiprocess one. The timings are
> still almost the same (sometimes random keys are faster).
> I left the cluster running overnight for stress testing. I ran several
> clients, each of which inserts 100K records of 25Kb. I noticed that one of
> my regionservers was closed. I analyzed the logs and it seems there was a
> timeout with the zookeeper service, which caused the regionserver to close.
> The cluster continued to work, but the test timings increased. I have a few
> questions.
> Should I shut down the whole cluster in such a case to bring the closed
> regionserver back to work?
> What will the master do in such cases? Will it reassign regions to other
> servers? How does this impact read/write performance?
> The logs of this regionserver are here: http://pastebin.com/m1c25e2ae
>
> --
> Thank you, Lyfar Dmitriy
>
