hama-dev mailing list archives

From "Suraj Menon (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HAMA-498) BSPTask should periodically ping its parent.
Date Wed, 29 Feb 2012 11:47:59 GMT

    [ https://issues.apache.org/jira/browse/HAMA-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219126#comment-13219126 ]

Suraj Menon commented on HAMA-498:
----------------------------------

Oops! I got caught disabling the RAT check in my pom.xml. The same goes for my indifference
to warnings. Sorry :)

There is a reason why I had the Future.get in the last test case and not in the first three.
I felt that the whole point of the implementation was that a minimum number of pings should
arrive if the task has run for a particular period of time. The test case could fail when
the task takes too long to start. If that happens repeatedly, then I think the test cases
should fail and we should reconsider the leeway given to each task to start. For the last
test case, I had to ensure that I got the first ping, then kill the RPC connection, and then
wait for the process to die to get its exit status. There I had a deterministic sequence of
events to wait for.
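
To illustrate the "minimum number of pings" idea, here is a minimal sketch, not the code from
the patch. The class name, the method names and the three parameters (run time, ping interval,
startup leeway) are all assumptions made up for this example.

```java
// Hypothetical helper, not part of HAMA-498.patch: computes a lower bound on
// the pings a parent should have seen after a task has run for a given time.
final class MinPingCheck {
    private MinPingCheck() {}

    /**
     * @param runMillis           how long the task has been observed running
     * @param pingIntervalMillis  assumed interval between pings (must be > 0)
     * @param startupLeewayMillis assumed time a task may take before its first ping
     * @return the minimum number of pings expected in that window
     */
    static long minExpectedPings(long runMillis, long pingIntervalMillis,
                                 long startupLeewayMillis) {
        if (pingIntervalMillis <= 0) {
            throw new IllegalArgumentException("pingIntervalMillis must be positive");
        }
        // Time before the leeway expires does not count towards the bound.
        long effectiveMillis = Math.max(0L, runMillis - startupLeewayMillis);
        return effectiveMillis / pingIntervalMillis;
    }

    /** True if the observed ping count meets the lower bound. */
    static boolean hasEnoughPings(long observedPings, long runMillis,
                                  long pingIntervalMillis, long startupLeewayMillis) {
        return observedPings
            >= minExpectedPings(runMillis, pingIntervalMillis, startupLeewayMillis);
    }
}
```

With a check like this, a task that starts too slowly fails the test only once its delay
exceeds the leeway, which is the case where the leeway itself should be reconsidered.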

I had discussed the enigma of port 40000 with you. :) I thought I had got over it. I am not
running BSPMaster for any of these test cases. This should be happening only for the last
test case, where the server closes the connection before the proxy. I shall check and find
a fix.

I ran into some issues when I used the LocalBSPRunner.LocalSyncClient class. Shall look into
it.
                
> BSPTask should periodically ping its parent.
> --------------------------------------------
>
>                 Key: HAMA-498
>                 URL: https://issues.apache.org/jira/browse/HAMA-498
>             Project: Hama
>          Issue Type: Sub-task
>          Components: bsp
>    Affects Versions: 0.4.0
>            Reporter: Edward J. Yoon
>            Assignee: Suraj Menon
>              Labels: newbie
>             Fix For: 0.5.0
>
>         Attachments: HAMA-498.patch
>
>
> As described in http://wiki.apache.org/hama/GroomServerFaultTolerance
> BSPTask should periodically ping its parent 'GroomServer' as a health check.
> 1. If a task is unable to ping its parent 'GroomServer', it should kill itself.
> 2. If the GroomServer does not receive a ping from a child, it should check whether
> that child is still running.
> You don't need to implement recovery logic in this issue.
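
The two rules in the issue description above can be sketched roughly as follows. This is an
illustration only, not the HAMA-498 patch: ParentPing, ChildPinger, PingTracker and the timeout
parameter are hypothetical names, and the real task talks to the GroomServer over Hama's RPC
layer rather than through a plain Java interface.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the RPC call a task makes to its parent GroomServer.
interface ParentPing {
    void ping(String taskId) throws Exception;
}

// Child side (rule 1): a failed ping tells the task to shut itself down.
final class ChildPinger {
    private ChildPinger() {}

    /** Returns true if the parent answered; false means the caller should exit. */
    static boolean pingOnce(ParentPing parent, String taskId) {
        try {
            parent.ping(taskId);
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}

// Parent side (rule 2): remember the last ping per child and report silent ones.
final class PingTracker implements ParentPing {
    private final Map<String, Long> lastPingMillis = new ConcurrentHashMap<>();
    private final long timeoutMillis; // assumed silence threshold

    PingTracker(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    @Override
    public void ping(String taskId) {
        recordPing(taskId, System.currentTimeMillis());
    }

    /** Takes the timestamp explicitly so tests do not depend on the clock. */
    void recordPing(String taskId, long nowMillis) {
        lastPingMillis.put(taskId, nowMillis);
    }

    /** Children whose last ping is older than the timeout, in sorted order. */
    List<String> staleChildren(long nowMillis) {
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastPingMillis.entrySet()) {
            if (nowMillis - entry.getValue() > timeoutMillis) {
                stale.add(entry.getKey());
            }
        }
        Collections.sort(stale);
        return stale;
    }
}
```

In this sketch the parent would call staleChildren periodically and then check whether each
reported child process is still running. Recovery is left out, as the issue requests.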

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
