hama-dev mailing list archives

From "Suraj Menon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HAMA-567) BSPPeer should provide means for chaining supersteps to share data among them.
Date Wed, 02 May 2012 20:38:50 GMT

    [ https://issues.apache.org/jira/browse/HAMA-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266887#comment-13266887 ]

Suraj Menon commented on HAMA-567:
----------------------------------

I was looking for something along the lines of how we save a reference to an object in an HttpSession
( http://docs.oracle.com/javaee/5/api/javax/servlet/http/HttpSession.html#setAttribute(java.lang.String, java.lang.Object) )
across multiple requests. Let's set aside remote access to peers for a moment. The first
requirement to meet is how supersteps running in the same peer can use the output of a
previous superstep. Say you have defined 10 superstep classes to run in sequence for the job.
What if the 10th superstep needs information that the 1st superstep computed? To keep these
values from going out of scope, the user today has to create singleton(s) for every object
they want to share across supersteps. Even if we use some other distributed caching framework,
what if the value you need in the 10th superstep cannot be reached through hard-coded
references and can only be inferred in one of the previous supersteps?

Introducing generics would be restrictive. I might want to save a reference to a DiskQueue,
or just a String, or a List of Integers, etc. That would be difficult to achieve with a single
generic type. Heap usage is something a programmer always keeps (or is expected to keep) in
mind for solutions on huge data sets. This map is intended to hold references to only a handful
of already instantiated objects, and only if needed.
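
To make the idea concrete, here is a rough sketch of the kind of holder I have in mind. It
is only an illustration: PeerAttributes, firstSuperstep and tenthSuperstep are hypothetical
names, not existing Hama classes or methods, and in the real change the get/set calls would
live on BSPPeer itself. The null handling mirrors HttpSession.setAttribute, where passing
null removes the attribute.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-peer attribute holder; NOT part of the Hama API. */
class PeerAttributes {
  // Values are plain Object references so any already-instantiated object can be shared.
  private final Map<String, Object> attributes = new ConcurrentHashMap<String, Object>();

  /** Stores a reference under the given name; a null value removes the entry. */
  void setAttribute(String name, Object value) {
    if (value == null) {
      attributes.remove(name);
    } else {
      attributes.put(name, value);
    }
  }

  /** Returns the stored reference, or null if nothing is stored under the name. */
  Object getAttribute(String name) {
    return attributes.get(name);
  }
}

/** Hypothetical stand-ins for two supersteps running in the same peer. */
class SuperstepSharingSketch {
  // The 1st superstep computes something and leaves a reference in the peer.
  static void firstSuperstep(PeerAttributes peer) {
    List<Integer> partialSums = new ArrayList<Integer>();
    partialSums.add(3);
    partialSums.add(4);
    peer.setAttribute("partialSums", partialSums);
  }

  // A later superstep looks the value up by its String token and casts it back.
  @SuppressWarnings("unchecked")
  static int tenthSuperstep(PeerAttributes peer) {
    List<Integer> partialSums = (List<Integer>) peer.getAttribute("partialSums");
    int total = 0;
    for (int value : partialSums) {
      total += value;
    }
    return total;
  }
}
```

The caller is responsible for the cast, which is the price of not tying the map to one
generic type. Only the reference is kept, so nothing is copied onto the heap beyond the
object the superstep had already created.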
                
> BSPPeer should provide means for chaining supersteps to share data among them.
> ------------------------------------------------------------------------------
>
>                 Key: HAMA-567
>                 URL: https://issues.apache.org/jira/browse/HAMA-567
>             Project: Hama
>          Issue Type: Improvement
>          Components: bsp core
>    Affects Versions: 0.6.0
>            Reporter: Suraj Menon
>             Fix For: 0.6.0
>
>
> In most scenarios, a superstep would need certain values or objects that were computed
> in the previous superstep. When using the chaining Superstep design to implement BSP algorithms,
> this gets a little ugly/difficult to implement. BSPPeer should provide means (preferably a
> map<String,Object>) so that the next Superstep can ask for the values in previous superstep
> using String token to query the map. Also, this map could be checkpointed periodically in
> the background so that we can completely recover the state of a task after failure. The BSPPeer
> object should have a dedicated get and set function for updating values in the peer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
