incubator-hama-dev mailing list archives

From "Thomas Jungblut (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HAMA-498) BSPTask should periodically ping its parent.
Date Wed, 29 Feb 2012 11:53:57 GMT

    [ https://issues.apache.org/jira/browse/HAMA-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219127#comment-13219127 ]

Thomas Jungblut commented on HAMA-498:
--------------------------------------

Yeah, I thought the sleep was there for a reason ^^

What about using the return value of the callable to report how much time has passed
in the method (measured via System.currentTimeMillis()) and comparing it with the number of
pings we received in that period? I guess this seems safer.
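A minimal sketch of that idea, not the actual test from the patch. The class name, the
100 ms ping interval, the 500 ms of simulated work and the one-ping tolerance are all
assumptions made for illustration. The callable returns its own elapsed time, and the
caller compares that with the number of pings counted in the meantime.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PingCountCheck {
    // Hypothetical ping interval; the real value would come from the configuration.
    static final long PING_INTERVAL_MS = 100;

    // Lowest ping count we accept for the elapsed time, tolerating one late ping.
    static long minExpectedPings(long elapsedMs, long intervalMs) {
        return Math.max(0, elapsedMs / intervalMs - 1);
    }

    static boolean enoughPings(long elapsedMs, long intervalMs, int received) {
        return received >= minExpectedPings(elapsedMs, intervalMs);
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger pings = new AtomicInteger();

        // Stand-in for the task side: counts one "ping" per interval.
        ScheduledExecutorService pinger = Executors.newSingleThreadScheduledExecutor();
        pinger.scheduleAtFixedRate(pings::incrementAndGet, 0, PING_INTERVAL_MS,
                TimeUnit.MILLISECONDS);

        // The callable measures how long it actually ran and returns that value,
        // so the check does not depend on a fixed sleep in the caller.
        Callable<Long> task = () -> {
            long start = System.currentTimeMillis();
            Thread.sleep(500); // simulated work
            return System.currentTimeMillis() - start;
        };

        ExecutorService worker = Executors.newSingleThreadExecutor();
        long elapsed = worker.submit(task).get();
        pinger.shutdownNow();
        worker.shutdown();

        System.out.println("elapsed ms: " + elapsed + ", pings: " + pings.get()
                + ", enough: " + enoughPings(elapsed, PING_INTERVAL_MS, pings.get()));
    }
}
```

The tolerance is there because the scheduler and the clock are not perfectly aligned;
how much slack is appropriate is a judgment call.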

bq.I ran into some issues when I used the LocalBSPRunner.LocalSyncClient class. Shall look into it.
Let me know what is wrong, but since you're not syncing there shouldn't be a problem.

bq.I had discussed the enigma of port 40000 with you.

Yeah, I'll take a deeper look into it later, but I guess that because of the runtime exception
the server does not close the socket properly, so the port is still blocked during the next
execution. Some time later the socket times out and frees itself. But that is not a really
clean solution.
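If the guess above is right, one way to avoid the blocked port would be to make sure the
server socket is closed even when a runtime exception is thrown. The following is only a
sketch with plain java.net classes and hypothetical names. It is not the Hama server code,
and it uses an ephemeral port instead of 40000 so that it can run anywhere.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class PortRelease {
    // Binds a server socket, then fails with a runtime exception while it is open.
    // try-with-resources closes the socket before the catch block runs.
    static int failWhileBound(int port) throws IOException {
        int bound = -1;
        try (ServerSocket server = new ServerSocket()) {
            server.setReuseAddress(true);
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
            bound = server.getLocalPort();
            throw new IllegalStateException("simulated task failure");
        } catch (IllegalStateException e) {
            // The socket is already closed here, so the port is free again.
        }
        return bound;
    }

    // Returns true if the port can be bound right now.
    static boolean canBind(int port) {
        try (ServerSocket probe = new ServerSocket()) {
            probe.setReuseAddress(true);
            probe.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        int port = failWhileBound(0); // 0 asks the OS for a free ephemeral port
        System.out.println("port " + port + " can be rebound: " + canBind(port));
    }
}
```

Whether this matches the actual cause of the port 40000 problem still has to be checked.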

                
> BSPTask should periodically ping its parent.
> --------------------------------------------
>
>                 Key: HAMA-498
>                 URL: https://issues.apache.org/jira/browse/HAMA-498
>             Project: Hama
>          Issue Type: Sub-task
>          Components: bsp
>    Affects Versions: 0.4.0
>            Reporter: Edward J. Yoon
>            Assignee: Suraj Menon
>              Labels: newbie
>             Fix For: 0.5.0
>
>         Attachments: HAMA-498.patch
>
>
> As described in http://wiki.apache.org/hama/GroomServerFaultTolerance
> BSPTask should periodically ping its parent 'GroomServer' for their health status.
> 1. If tasks are unable to ping their parent 'GroomServer', they should kill themselves.
> 2. And, if GroomServer does not receive a ping from a child, GroomServer should check whether that child is running.
> You don't need to implement recovery logic in this issue.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
