zookeeper-user mailing list archives

From Michi Mutsuzaki <mi...@cs.stanford.edu>
Subject Re: Unable to read additional data from server sessionid...
Date Thu, 10 Apr 2014 03:47:34 GMT
Hi Suijian,

The client log says that the client failed to read from the underlying
TCP socket. Maybe there was a network problem, or maybe the ZooKeeper
server the client was connected to died. It's difficult to say for
sure what happened without the server log though.
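
One way to narrow this down from the client machine is to probe the server port directly. The following is a minimal sketch, not a definitive diagnostic: it assumes the server still answers ZooKeeper's "ruok" four-letter command (newer releases may require it to be explicitly enabled), and the function name probe_zookeeper and its return strings are made up for illustration. The host and port in the usage note are the ones from your log.

```python
import socket


def probe_zookeeper(host, port, timeout=5.0):
    """Send 'ruok' to a ZooKeeper server and classify the outcome.

    Hypothetical helper: returns "ok", "refused", "timeout", or a short
    description of anything else that went wrong.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            # A healthy server is expected to answer "imok" and close.
            reply = sock.recv(16)
    except ConnectionRefusedError:
        # Nothing is listening on the port: the server process is
        # probably down or bound to a different address/port.
        return "refused"
    except socket.timeout:
        # No answer in time: points more toward a network problem
        # or a hung server.
        return "timeout"
    except OSError as exc:
        return "error: " + str(exc)
    return "ok" if reply == b"imok" else "unexpected reply: %r" % reply
```

For example, probe_zookeeper("compute-0-13.local", 22181) could be run in a loop while the job is running. A result of "refused" would match the "Connection refused" at the end of your client log and would suggest the server process went away, rather than the network.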

On Wed, Apr 9, 2014 at 3:04 PM, Suijian Zhou <suijian.zhou@gmail.com> wrote:
> Hi, Michi,
>   I could not find more logs about it, as ZooKeeper comes bundled with another
> graph processing system (I did not install ZooKeeper separately), and its
> ZooKeeper log files are all empty. Do you know of any possible
> reasons for this kind of error? The server itself has been running well the
> whole time. Why did the ZooKeeper session lose its connection so fast?
> Thanks!
>
>   Best Regards,
>   Suijian
>
>
> 2014-04-08 17:09 GMT-05:00 Michi Mutsuzaki <michi@cs.stanford.edu>:
>
>> Hi Suijian,
>>
>> Do you have the server-side log file?
>>
>> On Tue, Apr 8, 2014 at 3:05 PM, Suijian Zhou <suijian.zhou@gmail.com> wrote:
>> > Hi,
>> >   I have a problem with ZooKeeper: after the session has been established,
>> > it loses the connection in ~1 minute, although I see the timeout is set to
>> > 600000 ms, i.e. 10 minutes. What are the possible reasons?
>> >
>> > 14/04/08 16:55:22 INFO mapred.JobClient: Running job: job_201404081444_0018
>> > 14/04/08 16:55:22 INFO zookeeper.ClientCnxn: Opening socket connection to server compute-0-13.local/10.1.255.241:22181. Will not attempt to authenticate using SASL (unknown error)
>> > 14/04/08 16:55:22 INFO zookeeper.ClientCnxn: Socket connection established to compute-0-13.local/10.1.255.241:22181, initiating session
>> > 14/04/08 16:55:22 INFO zookeeper.ClientCnxn: Session establishment complete on server compute-0-13.local/10.1.255.241:22181, sessionid = 0x14543567f5e0009, negotiated timeout = 600000
>> > ......
>> > ......
>> > 14/04/08 16:57:02 INFO job.JobProgressTracker: Data from 8 workers - Compute superstep 2: 0 out of 4847571 vertices computed; 0 out of 64 partitions computed; min free memory on worker 2 - 216.01MB, average 287.75MB
>> > 14/04/08 16:57:07 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x14543567f5e0009, likely server has closed socket, closing socket connection and attempting reconnect
>> > 14/04/08 16:57:09 INFO zookeeper.ClientCnxn: Opening socket connection to server compute-0-13.local/10.1.255.241:22181. Will not attempt to authenticate using SASL (unknown error)
>> > 14/04/08 16:57:09 WARN zookeeper.ClientCnxn: Session 0x14543567f5e0009 for server null, unexpected error, closing socket connection and attempting reconnect
>> > java.net.ConnectException: Connection refused
>> >
>> >   Best Regards,
>> >   Suijian
>
>
