incubator-hama-dev mailing list archives

From "Thomas Jungblut (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Edited] (HAMA-387) Advanced Barrier Synchronization
Date Sat, 24 Sep 2011 06:02:26 GMT

    [ https://issues.apache.org/jira/browse/HAMA-387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113900#comment-13113900 ]

Thomas Jungblut edited comment on HAMA-387 at 9/24/11 6:00 AM:
---------------------------------------------------------------

Well, in our bsppeer code the enter and leave barrier methods are just two RPC calls. This
is cleaner than the whole sync and notify of ZK nodes.
In addition, we have our own sync service, which can now keep track of the superstep and any
additional information we want to keep there, for example which tasks are currently within the
barrier. So we don't need ZooKeeper at all.
We could also de-register tasks, so we can adjust the number of tasks needed to trip the
barrier at runtime. For that we could add another method, some kind of waitToHalt(),
which deregisters the task from the sync service.
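
To make the idea concrete, here is a minimal in-process sketch of such a sync service. It is not the code from the attached patches: the class name SyncServiceSketch and all method names are assumptions for illustration, the RPC layer is left out entirely, and deregister() only stands in for what a waitToHalt() call might trigger on the service side.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: names and structure are illustrative, not Hama's API.
final class SyncServiceSketch {
  private final Set<String> registered = new HashSet<>(); // tasks that must arrive
  private final Set<String> arrived = new HashSet<>();    // tasks waiting in the current phase
  private long generation = 0; // incremented each time a phase (enter or leave) trips

  synchronized void register(String taskId) {
    registered.add(taskId);
  }

  // Removing a task lowers the number of arrivals needed to trip the barrier.
  synchronized void deregister(String taskId) {
    registered.remove(taskId);
    arrived.remove(taskId);
    tripIfComplete(); // the remaining tasks may already all be waiting
  }

  // In a real service these two methods would be the two RPC calls.
  synchronized void enterBarrier(String taskId) {
    awaitPhase(taskId);
  }

  synchronized void leaveBarrier(String taskId) {
    awaitPhase(taskId);
  }

  // One superstep consists of one enter phase plus one leave phase.
  synchronized long getSuperstep() {
    return generation / 2;
  }

  // Copy of the set of tasks currently waiting inside the barrier.
  synchronized Set<String> tasksInBarrier() {
    return new HashSet<>(arrived);
  }

  private void awaitPhase(String taskId) {
    if (!registered.contains(taskId)) {
      throw new IllegalStateException("unknown task: " + taskId);
    }
    long seen = generation;
    arrived.add(taskId);
    tripIfComplete();
    while (generation == seen) { // loop guards against spurious wakeups
      try {
        wait();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting at barrier", e);
      }
    }
  }

  private void tripIfComplete() {
    if (!registered.isEmpty() && arrived.containsAll(registered)) {
      arrived.clear();
      generation++;
      notifyAll(); // release every task waiting in this phase
    }
  }
}
```

Each task would call enterBarrier() and then leaveBarrier() once per superstep. Because the service itself holds the registered set, deregistering a task at runtime immediately changes how many arrivals trip the barrier.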

Besides that, I think this is faster than ZK barrier sync.
To summarize: we would have full control, it is our own code, and there is no external
dependency. It is cleaner, and we can implement new features more easily with it.

And I guess I'll take the sync code for the MR NG integration, just because it is its own service
and I don't want to debug the BSPPeer barrier code.

      was (Author: thomas.jungblut):
    Well, in our groom code the enter and leave barrier methods are just two RPC calls. This
is cleaner than the whole sync and notify of ZK nodes.
In addition, we have our own sync service, which can now keep track of the superstep and any
additional information we want to keep there, for example which tasks are currently within the
barrier. So we don't need ZooKeeper at all.
We could also de-register tasks, so we can adjust the number of tasks needed to trip the
barrier at runtime. For that we could add another method, some kind of waitToHalt(),
which deregisters the task from the sync service.

Besides that, I think this is faster than ZK barrier sync.
To summarize: we would have full control, it is our own code, and there is no external
dependency. It is cleaner, and we can implement new features more easily with it.

And I guess I'll take the sync code for the MR NG integration, just because it is its own service
and I don't want to debug the BSPPeer barrier code.
  
> Advanced Barrier Synchronization
> --------------------------------
>
>                 Key: HAMA-387
>                 URL: https://issues.apache.org/jira/browse/HAMA-387
>             Project: Hama
>          Issue Type: Improvement
>          Components: bsp
>    Affects Versions: 0.3.0
>            Reporter: Edward J. Yoon
>            Assignee: ChiaHung Lin
>             Fix For: 0.4.0
>
>         Attachments: HAMA-387.patch, HAMA-387_v02.patch, HAMA-387_v03.patch, HAMA-387_v04.patch,
>                      doublebarrier.patch, new.patch, ownSyncService.patch, ownSyncService_v2.patch,
>                      ownSyncService_v3.patch, sleepless.patch, x.PNG, x.patch
>
>
> I think the lock file must include:
>  * the job ID
>  * the task ID of the lock file owner
>  * the current superstep count
> so that ownership and validity can be checked.
> Currently lock files are named by hostname, but multiple tasks may run per groom server
> in the future.
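
As a rough illustration of the naming scheme requested in the issue description, the sketch below encodes the job ID, the owner's task ID, and the superstep count into a lock node path and checks ownership from it. The class name BarrierNodeSketch, the "/bsp" prefix, and the path layout are assumptions made for this example only, not Hama's actual layout.

```java
// Hypothetical sketch: the path layout "/bsp/<jobId>/<superstep>/<taskId>" is assumed.
final class BarrierNodeSketch {
  final String jobId;
  final String taskId;
  final long superstep;

  BarrierNodeSketch(String jobId, String taskId, long superstep) {
    this.jobId = jobId;
    this.taskId = taskId;
    this.superstep = superstep;
  }

  // Encode all three fields, so two tasks on the same host never collide.
  String toPath() {
    return "/bsp/" + jobId + "/" + superstep + "/" + taskId;
  }

  // Recover the fields from a path; rejects anything not in the assumed layout.
  static BarrierNodeSketch parse(String path) {
    String[] parts = path.split("/"); // leading "/" yields an empty first element
    if (parts.length != 5 || !parts[1].equals("bsp")) {
      throw new IllegalArgumentException("unexpected lock path: " + path);
    }
    return new BarrierNodeSketch(parts[2], parts[4], Long.parseLong(parts[3]));
  }

  // Ownership and validity check: same job, same task, same superstep.
  boolean isOwnedBy(String job, String task, long currentSuperstep) {
    return jobId.equals(job) && taskId.equals(task) && superstep == currentSuperstep;
  }
}
```

With the task ID in the name instead of the hostname, several tasks on one groom server each get a distinct node, and a node left over from an earlier superstep fails the validity check.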

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
