hama-dev mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: connection loss exception
Date Wed, 16 Feb 2011 15:01:02 GMT
> I don't know what's going on but it works! Thread.sleep(100) helps!

Then, it's a mutex-related problem. We'll fix it soon. :)

Thanks!
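
For anyone reading the archive later, here is a minimal sketch of the workaround discussed in this thread: the body of bsp() wrapped in a loop, with a short pause after each pass. It is not Hama code. The class name, the runSupersteps helper, and the Runnable standing in for the bsp() body are all hypothetical and exist only for illustration. A sleep like this most likely just hides a timing problem rather than fixing it.

```java
// Sketch only: stands in for the loop wrapped around bsp() in this thread.
// SuperstepLoopSketch and runSupersteps are hypothetical names, not Hama API.
class SuperstepLoopSketch {

    /**
     * Runs the body `iterations` times, sleeping `pauseMillis` after each pass.
     * Returns the number of passes that completed.
     */
    static int runSupersteps(int iterations, long pauseMillis, Runnable body) {
        int completed = 0;
        for (int j = 0; j < iterations; j++) {
            body.run();                      // placeholder for the original bsp() code
            completed++;
            try {
                Thread.sleep(pauseMillis);   // the pause suggested in this thread
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                break;                              // stop looping if interrupted
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        // The thread used 100 iterations; 5 keeps this demo short.
        int done = runSupersteps(5, 100, () -> { /* original bsp() code */ });
        System.out.println("completed passes: " + done);
    }
}
```

In the real job the pause would sit inside bsp() itself, and the 100 ms value is only the number tried here, not a tuned setting.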

On Wed, Feb 16, 2011 at 11:47 PM, Paweł Brach <braszek@gmail.com> wrote:
> I don't know what's going on but it works! Thread.sleep(100) helps!
>
> Thanks,
> Pawel
>
> 2011/2/16 Edward J. Yoon <edward@udanax.org>
>
>> Looks like a sync problem. Can you try it again after adding a
>> Thread.sleep(100); line?
>>
>> Sent from my iPhone
>>
>> On 2011. 2. 16., at 3:24 PM, Paweł Brach <braszek@gmail.com> wrote:
>>
>> > Yes, of course I have. My cluster is configured, and both examples,
>> > PiEstimator and SerializePrinting, work (there is communication between 3
>> > nodes). I've modified your PiEstimator example (put everything in a
>> > loop). It works for a few iterations (there is communication), and then
>> > the connection is lost. The connection is later re-established, but some
>> > messages are missing. It looks like the Hama framework is very unstable
>> > under load, when many messages are being sent between nodes.
>> > On the same cluster I've configured Apache Hadoop and it's very stable.
>> > If you have your own cluster configured, could you run my example on it? Have
>> > you ever run anything more complicated than PiEstimator and
>> > SerializePrinting on it?
>> >
>> > Cheers,
>> > Pawel
>> >
>> > 2011/2/16 Chia-Hung Lin <clin4j@googlemail.com>
>> >
>> >> Have you configured zookeeper in hama-site.xml? Hama makes use of
>> >> zookeeper to do node communication IIRC.
>> >>
>> >>   Opening socket connection to server cl5/127.0.1.1:2181
>> >>
>> >> indicates that only localhost seems to be up. If this is the case, you
>> >> can change the hama.zookeeper.quorum property so that its value points
>> >> to your nodes, e.g.
>> >>
>> >> <property>
>> >>   <name>hama.zookeeper.quorum</name>
>> >>   <value>node1,node2,node3,node4,node5</value>
>> >> </property>
>> >>
>> >> Hope it helps
>> >>
>> >> 2011/2/15 Paweł Brach <braszek@gmail.com>:
>> >>> Hello,
>> >>>
>> >>> Over the last few days I've been testing Hama, and today I found a
>> >>> strange error in the Hama framework. If you run a simple job with more
>> >>> than a few supersteps, the following error occurs:
>> >>>
>> >>> 2011-02-15 15:13:55,934 ERROR org.apache.hama.bsp.BSPPeer:
>> >>> 2011-02-15 15:13:56,525 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server cl5/127.0.1.1:2181
>> >>> 2011-02-15 15:13:56,526 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
>> >>> java.net.ConnectException: Connection refused
>> >>>       at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>> >>>       at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>> >>>       at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078)
>> >>> 2011-02-15 15:13:56,626 ERROR org.apache.hama.bsp.BSPPeer:
>> >>> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp
>> >>>
>> >>> You can reproduce it by running PiEstimator (the newest source code from
>> >>> svn) with a small change: put the whole body of the bsp() method in a for
>> >>> loop. That is, add the following line at the beginning:
>> >>>
>> >>> for (int j = 0; j < 100; j++) {
>> >>> // original bsp() code
>> >>> }
>> >>>
>> >>> When I try to run it, the framework hangs and the error mentioned
>> >>> above occurs.
>> >>>
>> >>> Your help will be appreciated.
>> >>>
>> >>> Cheers,
>> >>>
>> >>> --
>> >>> Pawel Brach
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> ChiaHung Lin @ nuk, tw.
>> >>
>> >
>> >
>> >
>> > --
>> > Paweł Brach
>>
>
>
>
> --
> Paweł Brach
>



-- 
Best Regards, Edward J. Yoon
http://blog.udanax.org
http://twitter.com/eddieyoon
