hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Cannot open filename Exceptions
Date Thu, 25 Mar 2010 16:32:12 GMT
4 CPUs seem OK, unless you are running 2-3 MR tasks at the same time.

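So your value for the timeout is 240000, but did you change the tick
time? ZooKeeper caps the session timeout at 20 times the tick time, so
if you did not change the tick value the effective timeout is 3000*20 =
60000 ms, whatever session timeout you configured. That matches the GC
pause you got, which seemed to last almost a minute.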
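
To illustrate the arithmetic, here is a minimal sketch of settings that
would let a 240000 ms session timeout actually take effect. It assumes
the 20x tickTime cap quoted below applies to your ZooKeeper version and
that HBase manages the quorum. The tick value of 12000 is simply
240000 / 20, an assumption for illustration and not a tested
recommendation.

```xml
<!-- hbase-site.xml sketch; values are illustrative, not tested -->
<property>
  <name>hbase.zookeeper.property.tickTime</name>
  <!-- assumed value: 240000 / 20 = 12000 -->
  <value>12000</value>
</property>
<property>
  <name>zookeeper.session.timeout</name>
  <value>240000</value>
</property>
```

Note that a larger tick also raises the minimum session timeout, since
the minimum is 2 times the tick time (24000 ms in this sketch).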

J-D

On Thu, Mar 25, 2010 at 1:07 AM, Zheng Lv <lvzheng19800619@gmail.com> wrote:
> Hello J-D,
>  Thank you for your reply first.
>  >How many CPUs do you have?
>  Every server has 2 Dual-Core cpus.
>  >Are you swapping?
>  We can't tell for sure from our monitoring tools yet, but we have now
> written a script that logs vmstat output every 2 seconds. If something
> goes wrong again, we can capture it.
>  >Also, if you are currently using this system only to batch load
>  >data or as an analytics backend, you probably want to set the timeout
>  >higher:
>  But our value of this property is already 240000.
>
>  We will try to tune our garbage collector settings and see what
> happens.
>  Thanks again, J-D,
>    LvZheng
>
> 2010/3/25 Jean-Daniel Cryans <jdcryans@apache.org>
>
>> 2010-03-24 11:33:52,331 WARN org.apache.hadoop.hbase.util.Sleeper: We
>> slept 54963ms, ten times longer than scheduled: 3000
>>
>> You had a long garbage collector pause (a "stop-the-world" pause
>> in Java-speak) and your region server's session with ZooKeeper expired
>> (it stopped responding for so long that it was considered dead).
>> Are you swapping? How many CPUs do you have? Anything that slows
>> down garbage collection, such as swapping or CPU contention, makes
>> the pause longer.
>>
>> Also, if you are currently using this system only to batch load
>> data or as an analytics backend, you probably want to set the timeout
>> higher:
>>
>>  <property>
>>    <name>zookeeper.session.timeout</name>
>>    <value>60000</value>
>>    <description>ZooKeeper session timeout.
>>      HBase passes this to the zk quorum as suggested maximum time for a
>>      session.  See
>>
>> http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkSessions
>>      "The client sends a requested timeout, the server responds with the
>>      timeout that it can give the client. The current implementation
>>      requires that the timeout be a minimum of 2 times the tickTime
>>      (as set in the server configuration) and a maximum of 20 times
>>      the tickTime." Set the zk ticktime with
>> hbase.zookeeper.property.tickTime.
>>      In milliseconds.
>>    </description>
>>  </property>
>>
>> This value can be at most 20 times the tick time set here:
>>
>>  <property>
>>    <name>hbase.zookeeper.property.tickTime</name>
>>    <value>3000</value>
>>    <description>Property from ZooKeeper's config zoo.cfg.
>>    The number of milliseconds of each tick.  See
>>    zookeeper.session.timeout description.
>>    </description>
>>  </property>
>>
>>
>> So you could set tick to 6000, timeout to 120000 for a 2min timeout.
>>
