hadoop-mapreduce-issues mailing list archives

From "Milind Bhandarkar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-2911) Hamster: Hadoop And Mpi on the same cluSTER
Date Wed, 31 Aug 2011 03:27:10 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094277#comment-13094277
] 

Milind Bhandarkar commented on MAPREDUCE-2911:
----------------------------------------------

Progress report (I wish more people did this on a daily basis for the Jiras they are working
on):

Had a great conf call with Jeff Squyres and Ralph Castain (both at Cisco, OpenMPI stewards).
Both were excited that OpenMPI is getting Hadoop co-existence. They suggested that I base
the Hamster implementation on the "direct-launch" model, such as with slurmd. (Having used SLURM
in the past, I understand why :-)

So, the design described above has changed. Now, the way to launch MPI jobs on a Hadoop cluster
is simply:

{code}
hamster -np 32 a.out
{code}

"hamster" is a client application that connects to the RM and asks it to create an AM with 32
containers. After all of those are launched, it executes a.out inside them, after setting
some environment variables for connecting to the AM to get the "node list".
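
The flow above can be sketched roughly as follows. This is only an illustration in Python, not the Hamster code: the `plan_launch` helper, the `Container` type, and the `HAMSTER_*` environment variable names are all hypothetical, and the RM and AM are reduced to plain function arguments. It shows one launch spec per MPI process, each carrying the environment a process would need to contact the AM and ask for the node list.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Container:
    """Hypothetical stand-in for a container granted by the RM."""
    host: str
    rank: int


def plan_launch(
    num_procs: int,
    executable: str,
    am_host: str,
    am_port: int,
    hosts: List[str],
) -> List[Tuple[Container, List[str], Dict[str, str]]]:
    """Build one (container, command, environment) spec per MPI process.

    The environment variable names are assumptions made for this sketch;
    the message only says that "some environment variables" point each
    process at the AM so it can fetch the node list.
    """
    if num_procs <= 0:
        raise ValueError("num_procs must be positive")
    if not hosts:
        raise ValueError("at least one host is required")

    specs = []
    for rank in range(num_procs):
        # Spread processes over the available hosts round-robin.
        container = Container(host=hosts[rank % len(hosts)], rank=rank)
        env = {
            "HAMSTER_AM_HOST": am_host,        # hypothetical name
            "HAMSTER_AM_PORT": str(am_port),   # hypothetical name
            "HAMSTER_RANK": str(rank),         # hypothetical name
            "HAMSTER_NUM_PROCS": str(num_procs),
        }
        specs.append((container, [executable], env))
    return specs


if __name__ == "__main__":
    # Mirrors "hamster -np 32 a.out" on a made-up two-node cluster.
    for container, command, env in plan_launch(
        32, "a.out", "am.example.org", 9000, ["node1", "node2"]
    )[:2]:
        print(container.host, command, env["HAMSTER_RANK"])
```

In a real implementation the commands would only be started once every container has been granted, since the message says a.out runs after all of them are launched.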

That way, there are no security holes (as described above), since the MPI cluster exists only
for the duration of the job.

@Vinod, if you remember, I had sent an email on a Y-internal mailing list saying that HoD will make
a comeback. This is it. HoD was loved very much, especially after you and Hemanth took over,
in terms of stability. As I had said in that email, once the container abstraction is solidified,
HoD will make a comeback. So, here it is. (After all, creating a slice of a shared resource
has worked for the last 40 years, so why won't it work now?)

> Hamster: Hadoop And Mpi on the same cluSTER
> -------------------------------------------
>
>                 Key: MAPREDUCE-2911
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2911
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2
>    Affects Versions: 0.23.0
>         Environment: All Unix-Environments
>            Reporter: Milind Bhandarkar
>            Assignee: Milind Bhandarkar
>             Fix For: 0.23.0
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> MPI is commonly used for many machine-learning applications. OpenMPI (http://www.open-mpi.org/)
> is a popular BSD-licensed implementation of MPI. In the past, running MPI applications on a Hadoop
> cluster was achieved using Hadoop Streaming (http://videolectures.net/nipsworkshops2010_ye_gbd/),
> but it was kludgy. After the separation of the resource manager from the JobTracker in Hadoop, we have
> all the tools needed to make MPI a first-class citizen on a Hadoop cluster. I am currently
> working on the patch to make MPI an application master. An initial version of this patch will
> be available soon (hopefully before September 10). This jira will track the development of
> Hamster: the application master for MPI.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
