incubator-hama-dev mailing list archives

From "Suraj Menon (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HAMA-498) BSPTask should periodically ping its parent.
Date Tue, 21 Feb 2012 13:22:34 GMT

    [ https://issues.apache.org/jira/browse/HAMA-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13212568#comment-13212568 ]

Suraj Menon commented on HAMA-498:
----------------------------------

I have two questions:
1. What is a reasonable value for the ping period here? I am currently setting it to
1 second. Is that too high or too low?
2. BSPPeerChild waits for the completion of the task. Would we get rid of that wait once
we have this feature? If not, how does pinging help? Suppose the main logic of BSPTask (or
BSPTaskRunner) hangs while the ping thread in BSPTask is still active: the GroomServer
would keep receiving pings from a task that is stuck.

The current code excerpt looks like this:

// Runnable that sends one ping to the parent GroomServer per scheduled run.
private static class PingGroomServer implements Runnable {

  private final BSPPeerProtocol pingRPC;
  private final TaskAttemptID taskId;

  public PingGroomServer(BSPPeerProtocol umbilical, TaskAttemptID id) {
    pingRPC = umbilical;
    taskId = id;
  }

  @Override
  public void run() {
    try {
      // getTime() logs a readable timestamp instead of the full Calendar dump.
      LOG.debug("Pinging at time " + Calendar.getInstance().getTime());
      pingRPC.ping(taskId);
    } catch (IOException e) {
      LOG.error(
          new StringBuilder("IOException pinging GroomServer from task - ")
          .append(taskId), e);
      //System.exit(1);
    } catch (Exception e) {
      // An uncaught exception would suppress later runs of the scheduled task.
      LOG.error(
          new StringBuilder("Exception pinging GroomServer from task - ")
          .append(taskId), e);
      //System.exit(1);
    }
  }
}
...// body of BSPTask ..
 
this.pingService = Executors.newScheduledThreadPool(1);
 
 
private void startPingingGroom(BSPJob job, BSPPeerProtocol umbilical) {
  LOG.debug("Scheduling ping service");
  // Ping at half the configured period.
  long pingPeriod = job.getConf().getLong(Constants.GROOM_PING_PERIOD,
      Constants.DEFAULT_GROOM_PING_PERIOD) / 2;
  LOG.debug("Scheduling with fixed delay for bsp task " + taskId);
  try {
    if (pingPeriod > 0) {
      pingService.scheduleWithFixedDelay(
          new PingGroomServer(umbilical, taskId),
          0, pingPeriod, TimeUnit.MILLISECONDS);
      // Log success only when a ping task was actually scheduled.
      LOG.debug("Scheduled ping service");
    }
  } catch (Exception e) {
    LOG.error("Error scheduling ping service", e);
  }
}
 
  private void stopPingingGroom(){
    if(pingService != null){
      pingService.shutdownNow();
    }
  }
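
One possible way to address the concern in question 2 (the ping thread stays alive while
the main task logic hangs) is to tie each ping to observed progress. The sketch below is
hypothetical and not part of Hama: ProgressAwarePinger, reportProgress() and the injected
clock are illustrative names, and the Runnable passed in stands in for
pingRPC.ping(taskId). The pinger stops pinging once the main logic has reported no
progress for longer than maxSilenceMillis, so the parent can notice the silence.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical helper (not Hama API): pings only while the main logic reports progress. */
final class ProgressAwarePinger implements Runnable {
  private final LongSupplier clock;          // injected time source, e.g. System::currentTimeMillis
  private final long maxSilenceMillis;       // longest tolerated gap between progress reports
  private final Runnable ping;               // stands in for pingRPC.ping(taskId)
  private final AtomicLong lastProgressMillis;

  ProgressAwarePinger(LongSupplier clock, long maxSilenceMillis, Runnable ping) {
    this.clock = clock;
    this.maxSilenceMillis = maxSilenceMillis;
    this.ping = ping;
    this.lastProgressMillis = new AtomicLong(clock.getAsLong());
  }

  /** Called by the main task logic each time it completes a unit of work. */
  void reportProgress() {
    lastProgressMillis.set(clock.getAsLong());
  }

  /** True when the main logic has been silent for longer than the allowed gap. */
  boolean isStalled() {
    return clock.getAsLong() - lastProgressMillis.get() > maxSilenceMillis;
  }

  /** Invoked by the scheduler; skips the ping when the main logic looks stuck. */
  @Override
  public void run() {
    if (!isStalled()) {
      ping.run();
    }
  }
}
```

Under these assumptions the pinger would be scheduled the same way as PingGroomServer
above, and the main loop of the task would call reportProgress(), for example after each
superstep. Choosing maxSilenceMillis is a judgment call: a legitimately long computation
with no progress reports would look the same as a hang.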

                
> BSPTask should periodically ping its parent.
> --------------------------------------------
>
>                 Key: HAMA-498
>                 URL: https://issues.apache.org/jira/browse/HAMA-498
>             Project: Hama
>          Issue Type: Sub-task
>          Components: bsp
>    Affects Versions: 0.4.0
>            Reporter: Edward J. Yoon
>              Labels: newbie
>             Fix For: 0.5.0
>
>
> As described in http://wiki.apache.org/hama/GroomServerFaultTolerance,
> a BSPTask should periodically ping its parent GroomServer as a health check.
> 1. If a task is unable to ping its parent GroomServer, it should kill itself.
> 2. If the GroomServer does not receive pings from a child, it should check
> whether that child is still running.
> You don't need to implement recovery logic in this issue.
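
Requirement 1 in the issue description could be sketched as a counter of consecutive
failed pings that triggers a shutdown action once a threshold is reached. Everything here
is an assumption for illustration: SelfTerminatingPinger, maxFailures and onGiveUp are
hypothetical names, the Callable stands in for pingRPC.ping(taskId), and in a real task
onGiveUp might be something like the System.exit(1) call that is commented out in the
excerpt.

```java
import java.util.concurrent.Callable;

/** Hypothetical sketch (not Hama API): gives up after N consecutive failed pings. */
final class SelfTerminatingPinger implements Runnable {
  private final Callable<Void> ping;   // stands in for pingRPC.ping(taskId)
  private final int maxFailures;       // consecutive failures tolerated before giving up
  private final Runnable onGiveUp;     // shutdown action chosen by the caller
  private int failures;                // touched only by the single ping thread

  SelfTerminatingPinger(Callable<Void> ping, int maxFailures, Runnable onGiveUp) {
    this.ping = ping;
    this.maxFailures = maxFailures;
    this.onGiveUp = onGiveUp;
  }

  /** Number of consecutive failures seen since the last successful ping. */
  int failureCount() {
    return failures;
  }

  @Override
  public void run() {
    try {
      ping.call();
      failures = 0;                    // any successful ping resets the counter
    } catch (Exception e) {
      // Swallow the exception so the scheduler keeps running this task.
      if (++failures >= maxFailures) {
        onGiveUp.run();
      }
    }
  }
}
```

Tolerating a few failures before giving up avoids killing a task over a single transient
network error. The right threshold depends on the ping period and is left open here.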


        
