hadoop-zookeeper-user mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: Bit of help debugging a TIMED OUT session please
Date Tue, 23 Feb 2010 22:21:08 GMT
Hard to say based on the bits/pieces of the log we have access to. I'd 
have to see the full log, preferably from both the server and client, to 
gain more insight.

Re the low numbers: this is the received count for the server, which
should only ever increase, never decrease. A count this low indicates
either that the server recently restarted or that clients are not
attaching to it. It seems like it should be near the other servers'
counts but, again, it is hard to tell through the small aperture we
have via mail.

Patrick

Stack wrote:
> Thanks Patrick.  See below.
> 
> On Tue, Feb 23, 2010 at 1:19 PM, Patrick Hunt <phunt@apache.org> wrote:
>> Stack you might look at the following:
>>
>> 1) why does server 14 have such a low recv count?
>>
>>        Received: 194
>>
>> while the other servers are at 3.7k+ received. Did server 14 fail at some
>> point? Or its network? This may have caused the timeout seen by the client:
>>
> 
> Ok.  Will check into this the next time.  I did take the dump after
> the observed TIMED_OUT, a good while after.  Could this be why the
> numbers are low?
> 
>> ------snippet-----
>> 2010-02-21 18:23:55,583 [main-SendThread] INFO
>> org.apache.zookeeper.ClientCnxn: Attempting connection to server
>> 14.u.XXX.com/X.X.X.141:2181
>> 2010-02-21 18:24:00,423
>> [regionserver/208.76.44.140:60020.compactor-SendThread] WARN
>> org.apache.zookeeper.ClientCnxn: Exception closing session
>> 0x226ed968a270003 to sun.nio.ch.SelectionKeyImpl@2a50e9a3
>> java.io.IOException: TIMED OUT
>>        at
>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:906)
>> -----------
>>
>> 2) Connection timeout is different from session timeout. Connection timeout
>> is the amount of time we allow for connection establishment (socket open)
>> until the server accepts the connection. Its value is the session timeout
>> (as requested by the client) divided by the number of hosts in the host
>> list. This could account for why the timeout (above snippet) occurred after
>> 5 seconds. What timeout value is this client using? 15 seconds?
>>
> We ask for a session timeout of 60 seconds -- the hbase default -- and
> our ticktime is 3 seconds.
> 
> You are not troubled at all by the exceptions closing sessions above?
>  Are these just noise?
> 
> Thanks for the input,
> St.Ack
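
The connection-timeout arithmetic in point 2 of the quoted message can be
sketched as follows. This is a minimal illustration, not ZooKeeper's client
code: the function name is hypothetical, and the host count of 3 is an
assumption (the thread never states how many hosts are in the list). The two
timestamps are copied from the log snippet above.

```python
from datetime import datetime


def connect_timeout_seconds(session_timeout_s: float, num_hosts: int) -> float:
    """Per-host connection timeout: session timeout divided by host count.

    Hypothetical helper that mirrors the rule described in the thread.
    """
    if num_hosts <= 0:
        raise ValueError("num_hosts must be positive")
    return session_timeout_s / num_hosts


# Elapsed time between the two log lines quoted in the thread.
LOG_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
attempt = datetime.strptime("2010-02-21 18:23:55,583", LOG_FORMAT)
timed_out = datetime.strptime("2010-02-21 18:24:00,423", LOG_FORMAT)
elapsed_s = (timed_out - attempt).total_seconds()

# ASSUMPTION: 3 hosts in the client's host list (not stated in the thread).
ASSUMED_HOSTS = 3
guessed = connect_timeout_seconds(15, ASSUMED_HOSTS)   # the 15 s guess
reported = connect_timeout_seconds(60, ASSUMED_HOSTS)  # the 60 s HBase default

print(f"elapsed between log lines: {elapsed_s:.2f} s")
print(f"15 s session / {ASSUMED_HOSTS} hosts: {guessed:.1f} s")
print(f"60 s session / {ASSUMED_HOSTS} hosts: {reported:.1f} s")
```

Under the assumed host count, a 15 second session timeout gives a per-host
connection timeout close to the gap between the two log lines, while the 60
second value Stack reports does not. The two log lines also come from
different threads, so the gap may not measure a single connection attempt.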
