hama-dev mailing list archives

From "Thomas Jungblut (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HAMA-431) MapReduce NG integration
Date Sun, 11 Sep 2011 14:53:09 GMT

    [ https://issues.apache.org/jira/browse/HAMA-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102284#comment-13102284 ]

Thomas Jungblut commented on HAMA-431:

Wow that's a wall of text :D

I'm not a contributor (yet?), so I don't have SVN access; that was the main reason I chose the
Google Code repo.
Yes, we took a lot of Hadoop's old code for HamaV1. These days we don't have failure recovery;
detection should be on its way (HAMA-370).

*Fault tolerance* in HamaV2 should basically just check whether a container is available through
some kind of heartbeat. If a task isn't responding, we should roll back to the state it was in
before. The task is responsible for saving its state every superstep, e.g. the messages received
from other peers. This should be stored in HDFS along with the task-id so the AM can rerun
the task with this input. -> We need some kind of task attempts.
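
To make the idea above a bit more concrete, here is a minimal sketch of the heartbeat check plus per-superstep checkpointing. Everything in it is an assumption for illustration: the class SuperstepRecovery and its methods are hypothetical names, not Hama or YARN APIs, and a plain in-memory map stands in for HDFS.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: heartbeat tracking, checkpoints and task attempts. */
final class SuperstepRecovery {

    /** State a task saves after a superstep: the messages it received. */
    static final class Checkpoint {
        final long superstep;
        final List<String> messages;

        Checkpoint(long superstep, List<String> messages) {
            this.superstep = superstep;
            this.messages = new ArrayList<>(messages); // defensive copy
        }
    }

    private final long timeoutMillis;
    private final Map<String, Long> lastHeartbeat = new HashMap<>();
    // Assumption: stand-in for files written to HDFS, keyed by task id.
    private final Map<String, Checkpoint> checkpoints = new HashMap<>();
    private final Map<String, Integer> attempts = new HashMap<>();

    SuperstepRecovery(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Called whenever a task reports in. */
    void heartbeat(String taskId, long nowMillis) {
        lastHeartbeat.put(taskId, nowMillis);
    }

    /** Called by the task at the end of every superstep. */
    void saveCheckpoint(String taskId, long superstep, List<String> messages) {
        checkpoints.put(taskId, new Checkpoint(superstep, messages));
    }

    /** A task is responsive if its last heartbeat is within the timeout. */
    boolean isResponsive(String taskId, long nowMillis) {
        Long last = lastHeartbeat.get(taskId);
        return last != null && nowMillis - last <= timeoutMillis;
    }

    /** Registers a new attempt for a task and returns the attempt number. */
    int startNewAttempt(String taskId, long nowMillis) {
        lastHeartbeat.put(taskId, nowMillis); // the new attempt starts fresh
        return attempts.merge(taskId, 1, Integer::sum);
    }

    /** Latest saved state for the task, or null if it never checkpointed. */
    Checkpoint latestCheckpoint(String taskId) {
        return checkpoints.get(taskId);
    }
}
```

The AM side would then do something like: if isResponsive(...) returns false, call startNewAttempt(...) and rerun the task from latestCheckpoint(...). Keeping the attempt counter separate from the checkpoint means a task that dies before its first checkpoint can still be retried from the beginning.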

*Implementation of barrier synchronization:*
I would be very glad if we could get away from ZooKeeper's sync service. We had a lot of ideas
for how to get it running (see HAMA-387), but none of them helped. Edward asked a question on their
user list, but they offered just the same ideas we had already tried.

This should be agreeable, no?

Polling is totally agreeable. I very much doubt that ZooKeeper isn't polling internally as well.
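
As a rough illustration of what a polling-based barrier could look like, here is a minimal sketch. It is not the Hama or ZooKeeper implementation: PollingBarrier and its methods are hypothetical names, and an in-memory counter per superstep stands in for whatever shared service (for example the AM) would actually track arrivals.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a barrier that peers poll instead of being notified. */
final class PollingBarrier {
    private final int numPeers;
    private final long pollIntervalMillis;
    // Assumption: stand-in for shared state; one arrival counter per superstep.
    private final ConcurrentHashMap<Long, AtomicInteger> arrivals =
            new ConcurrentHashMap<>();

    PollingBarrier(int numPeers, long pollIntervalMillis) {
        this.numPeers = numPeers;
        this.pollIntervalMillis = pollIntervalMillis;
    }

    private AtomicInteger counter(long superstep) {
        return arrivals.computeIfAbsent(superstep, s -> new AtomicInteger());
    }

    /** A peer announces that it has finished the given superstep. */
    void arrive(long superstep) {
        counter(superstep).incrementAndGet();
    }

    /**
     * Polls until all peers have arrived at the superstep.
     * Returns false if the timeout expires or the thread is interrupted.
     */
    boolean awaitAll(long superstep, long timeoutMillis) {
        AtomicInteger count = counter(superstep);
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (count.get() < numPeers) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out; the caller decides what to do
            }
            try {
                Thread.sleep(pollIntervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }

    /** Convenience: arrive, then wait for everyone else. */
    boolean enterBarrier(long superstep, long timeoutMillis) {
        arrive(superstep);
        return awaitAll(superstep, timeoutMillis);
    }
}
```

Each peer would call enterBarrier(superstep, timeout) at the end of a superstep. A false return is the hook where a heartbeat or task-attempt mechanism could step in, since a barrier that never completes usually means some peer is gone.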

Reuse of MRV2 classes

As you can see, I totally reuse your classes. It's cool, but it is more work to cut down
your state machine handling to something simpler than to rewrite it from scratch.

I do clearly see that we should re-use MRV2 components like ContainerLauncher (launches containers
on nodes), RMContainerAllocator (requests containers from ResourceManager), I'll see how we
can move these to a separate common library module from MRV2 so that Hama (and possibly others)
can use them.

+1, that would be great.

Instead of jumping into writing the implementation, I think it helps to spend some time developing
the design till it reaches some level of stability and then writing down the module structure

You are right.

> MapReduce NG integration
> ------------------------
>                 Key: HAMA-431
>                 URL: https://issues.apache.org/jira/browse/HAMA-431
>             Project: Hama
>          Issue Type: New Feature
>            Reporter: Thomas Jungblut
>            Assignee: Thomas Jungblut
> We should take a look at how to integrate Hama's BSP Engine to Hadoop's nextGen application
> Can be currently found in the 0.23 branch.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira