zookeeper-user mailing list archives

From Brian Takita <brian.tak...@gmail.com>
Subject Re: Timeout errors from the Ruby client library (get_children)
Date Sun, 22 May 2011 09:46:08 GMT
I have been debugging this issue. The client appears to block in the
pthread_cond_wait call inside the wait_sync_completion function. The
call stack is:

    wait_sync_completion
    zoo_wget_children2_
    zoo_wget_children2
    method_get_children

I opened a ticket in the twitter/zookeeper project on GitHub:

https://github.com/twitter/zookeeper/issues/7

I isolated this issue on an EC2 instance. I can open up a screen
session for whoever wants to help out.

Thanks,
Brian

On Mon, May 16, 2011 at 12:09 PM, Brian Takita <brian.takita@gmail.com> wrote:
> Hello, I am attempting to call get_children on any node, but I am
> receiving a timeout error. The rc is -7.
>
> The get command seems to work and recognize the correct number of children.
>
> When I run zkCli.sh on any of the zookeeper instances, ls / works.
>
> I am seeing the following log entries on one of the instances:
>
> 2011-05-16 18:58:36,924 - INFO
> [NIOServerCxn.Factory:2181:NIOServerCnxn@607] - Connected to
> /10.162.157.123:60135 lastZxid 0
> 2011-05-16 18:58:36,925 - INFO
> [NIOServerCxn.Factory:2181:NIOServerCnxn@992] - Finished init of
> 0x22fe336b7ca000d valid:true
> 2011-05-16 18:58:36,925 - INFO
> [NIOServerCxn.Factory:2181:NIOServerCnxn@636] - Renewing session
> 0x22fe336b7ca000d
> 2011-05-16 18:58:36,925 - WARN
> [NIOServerCxn.Factory:2181:NIOServerCnxn@518] - Exception causing
> close of session 0x22fe336b7ca000d due to java.io.IOException: Read
> error
> 2011-05-16 18:58:36,925 - INFO
> [NIOServerCxn.Factory:2181:NIOServerCnxn@857] - closing
> session:0x22fe336b7ca000d NIOServerCnxn:
> java.nio.channels.SocketChannel[connected local=/10.162.151.144:2181
> remote=/10.162.157.123:60135]
>
> I'm running 5 zookeeper servers on EC2. I can ping each instance from
> the other instances. Basically, they all belong to the same security
> group. I'm having a tough time knowing where to look for this Read
> error. Any hints?
>
> Thanks,
> Brian
>
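
For reference, the rc of -7 reported in the quoted message is
ZOPERATIONTIMEOUT in the ZOO_ERRORS enum of the C client's
zookeeper.h. The sketch below is a small lookup helper in Python. The
helper name describe_rc and the table name are hypothetical, and the
table covers only the system-error range (0 through -9), not the API
errors.

```python
# Subset of the ZOO_ERRORS enum from the ZooKeeper C client header
# (zookeeper.h). Only the system-error range is listed here.
ZOO_SYSTEM_ERRORS = {
    0: "ZOK",
    -1: "ZSYSTEMERROR",
    -2: "ZRUNTIMEINCONSISTENCY",
    -3: "ZDATAINCONSISTENCY",
    -4: "ZCONNECTIONLOSS",
    -5: "ZMARSHALLINGERROR",
    -6: "ZUNIMPLEMENTED",
    -7: "ZOPERATIONTIMEOUT",
    -8: "ZBADARGUMENTS",
    -9: "ZINVALIDSTATE",
}


def describe_rc(rc):
    """Return the symbolic name for a return code, if it is in the table."""
    return ZOO_SYSTEM_ERRORS.get(rc, "unknown rc %d" % rc)


print(describe_rc(-7))
```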
