incubator-hama-dev mailing list archives

From "Suraj Menon (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HAMA-498) BSPTask should periodically ping its parent.
Date Fri, 24 Feb 2012 12:31:49 GMT

    [ https://issues.apache.org/jira/browse/HAMA-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215572#comment-13215572
] 

Suraj Menon commented on HAMA-498:
----------------------------------

While testing with fault injection at different points, I found a few issues. I am changing the
implementation of BSPTask to conform to the BSP documentation: at present, a failure in the
bsp function skips the cleanup function call.
Today's code:
{noformat}
private final <KEYIN, VALUEIN, KEYOUT, VALUEOUT, M extends Writable> void runBSP(
      final BSPJob job, BSPPeerImpl<KEYIN, VALUEIN, KEYOUT, VALUEOUT, M> bspPeer,
      final BytesWritable rawSplit, final BSPPeerProtocol umbilical)
          throws IOException, SyncException, ClassNotFoundException,
          InterruptedException {

    BSP<KEYIN, VALUEIN, KEYOUT, VALUEOUT, M> bsp =
        (BSP<KEYIN, VALUEIN, KEYOUT, VALUEOUT, M>) ReflectionUtils
        .newInstance(job.getConf().getClass("bsp.work.class", BSP.class),
            job.getConf());
    bsp.setup(bspPeer);
    bsp.bsp(bspPeer);
    bsp.cleanup(bspPeer);
    bspPeer.close();
}
    
{noformat}
Changed code:
{noformat}
private final <KEYIN, VALUEIN, KEYOUT, VALUEOUT, M extends Writable> void runBSP(
      final BSPJob job, BSPPeerImpl<KEYIN, VALUEIN, KEYOUT, VALUEOUT, M> bspPeer,
      final BytesWritable rawSplit, final BSPPeerProtocol umbilical)
          throws IOException, SyncException, ClassNotFoundException,
          InterruptedException {

    BSP<KEYIN, VALUEIN, KEYOUT, VALUEOUT, M> bsp =
        (BSP<KEYIN, VALUEIN, KEYOUT, VALUEOUT, M>) ReflectionUtils
        .newInstance(job.getConf().getClass("bsp.work.class", BSP.class),
            job.getConf());
    bsp.setup(bspPeer);
    try {
      bsp.bsp(bspPeer);
    } finally {
      try {
        bsp.cleanup(bspPeer);
      } finally {
        // Should we trust close() not to throw? If it does throw here, its
        // exception masks any exception from bsp() or cleanup(), so we will
        // need to catch the earlier exception and rethrow it instead.
        bspPeer.close();
      }
    }
}
{noformat}
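
The masking concern raised in the code comment can be illustrated outside Hama. The following is only a sketch under assumptions: `Step` and `LifecycleSketch` are hypothetical names, not Hama classes, and `Throwable.addSuppressed` requires Java 7 or later, which the project may not be targeting. It runs the work, cleanup and close steps unconditionally, rethrows the first exception, and attaches any later ones as suppressed rather than letting them replace it.

```java
import java.io.IOException;

// Hypothetical stand-in for one lifecycle phase; not Hama's BSP or BSPPeer API.
interface Step {
  void run() throws Exception;
}

class LifecycleSketch {

  /**
   * Runs work, then cleanup, then close. Every step is attempted even if an
   * earlier one threw an Exception. The first exception is rethrown; later
   * ones are attached to it as suppressed instead of masking it.
   */
  static void runAll(Step work, Step cleanup, Step close) throws Exception {
    Exception primary = null;
    for (Step step : new Step[] {work, cleanup, close}) {
      try {
        step.run();
      } catch (Exception e) {
        if (primary == null) {
          primary = e;              // remember the first failure
        } else {
          primary.addSuppressed(e); // keep later failures without masking
        }
      }
    }
    if (primary != null) {
      throw primary;
    }
  }

  public static void main(String[] args) {
    try {
      runAll(
          () -> { throw new IllegalStateException("bsp failed"); },
          () -> System.out.println("cleanup ran"),
          () -> { throw new IOException("close failed"); });
    } catch (Exception e) {
      System.out.println("thrown: " + e.getMessage());
      System.out.println("suppressed: " + e.getSuppressed()[0].getMessage());
    }
  }
}
```

With plain nested try/finally, the `IOException` from the close step would be the one the caller sees. In this sketch the caller sees the failure from the work step, and the close failure stays reachable through `getSuppressed()`.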

Let me know if you have any comments on it. I shall make necessary changes before I upload
the patch.
                
> BSPTask should periodically ping its parent.
> --------------------------------------------
>
>                 Key: HAMA-498
>                 URL: https://issues.apache.org/jira/browse/HAMA-498
>             Project: Hama
>          Issue Type: Sub-task
>          Components: bsp
>    Affects Versions: 0.4.0
>            Reporter: Edward J. Yoon
>            Assignee: Suraj Menon
>              Labels: newbie
>             Fix For: 0.5.0
>
>
> As described in http://wiki.apache.org/hama/GroomServerFaultTolerance
> BSPTask should periodically ping its parent 'GroomServer' for their health status.
> 1. If a task is unable to ping its parent 'GroomServer', it should kill itself.
> 2. If the GroomServer does not receive a ping from a child, it should check
> whether that child is running.
> You don't need to implement recovery logic in this issue.
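
For the task side of the description above (point 1), a rough sketch of the ping rule might look like the following. Everything here is assumed for illustration: `ParentLink` and `PingMonitorSketch` are hypothetical names, not Hama's umbilical protocol, and the miss threshold is an invented parameter that the issue does not specify. The GroomServer side (point 2) is not covered.

```java
// Hypothetical view of the parent; Hama's real protocol is not shown in this thread.
interface ParentLink {
  boolean ping() throws Exception;
}

class PingMonitorSketch {
  private final ParentLink parent;
  private final int maxMisses;          // assumed threshold, not taken from the issue
  private final Runnable onParentLost;  // in a real task this would stop the task
  private int misses = 0;
  private boolean lost = false;

  PingMonitorSketch(ParentLink parent, int maxMisses, Runnable onParentLost) {
    this.parent = parent;
    this.maxMisses = maxMisses;
    this.onParentLost = onParentLost;
  }

  /**
   * Makes one ping attempt; meant to be called at a fixed interval.
   * Returns false once the parent is considered lost, true otherwise.
   */
  boolean tick() {
    if (lost) {
      return false;
    }
    boolean ok;
    try {
      ok = parent.ping();
    } catch (Exception e) {
      ok = false; // a failed call counts as a missed ping
    }
    if (ok) {
      misses = 0; // any successful ping resets the counter
      return true;
    }
    misses++;
    if (misses >= maxMisses) {
      lost = true;
      onParentLost.run(); // fires exactly once
      return false;
    }
    return true;
  }

  public static void main(String[] args) {
    int[] calls = {0};
    ParentLink flaky = () -> ++calls[0] <= 2; // answers twice, then goes silent
    PingMonitorSketch monitor = new PingMonitorSketch(flaky, 3,
        () -> System.out.println("parent lost, task would exit here"));
    while (monitor.tick()) {
      // a real task would wait for the ping interval here
    }
    System.out.println("pings attempted: " + calls[0]);
  }
}
```

The monitor is kept free of threads so the rule itself is easy to test. In a running task, `tick()` could be driven by a timer such as `ScheduledExecutorService.scheduleAtFixedRate`.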


        
