incubator-mesos-dev mailing list archives

From Matei Zaharia <>
Subject Re: Question about Mesos.
Date Fri, 01 Jul 2011 17:07:12 GMT
That's a good question. Right now we were planning to provide a wrapper that has the same API
as the resource manager in HNG, so that not only MapReduce but other apps written against
that API will work. We'll see if we run into any unforeseen problems with that.


On Jul 1, 2011, at 3:37 AM, Edward J. Yoon wrote:

> Here's another silly question.
> Does Mesos plan to add support for HNG, or will it support only pure Map/Reduce?
> On Fri, Jul 1, 2011 at 2:15 PM, Ted Dunning <> wrote:
>> Also, both projects are changing in terms of what they do and what they
>> intend to do.
>> For instance, support for long running processes and alternative execution
>> models other than map-reduce is an explicit goal for Yarn.
>> This illustrates how hard it is for anybody to compare systems.  Typically,
>> any given person knows much more about one system than the other leading to
>> many comparison points that are only half true (that half being the one with
>> better information).  This isn't remediable without collaborative discussion
>> between (differently) informed speakers.
>> On Thu, Jun 30, 2011 at 10:10 PM, Edward J. Yoon <> wrote:
>>> Understood.
>>> On Fri, Jul 1, 2011 at 1:59 PM, Matei Zaharia <>
>>> wrote:
>>>> I wouldn't say it's designed for Yahoo! only, but it's definitely meant
>>> to solve issues they saw with large Hadoop clusters (and provides a lot of
>>> value for that).
>>>> Matei
>>>> On Jul 1, 2011, at 12:51 AM, Edward J. Yoon wrote:
>>>>> Hmm, HNG seems designed for their (Y!) own circumstance.
>>>>> On Fri, Jul 1, 2011 at 12:47 PM, Matei Zaharia <>
>>> wrote:
>>>>>> Ted brought up some superficial differences, but if you want to
>>> understand technical differences, there are a bunch of those as well. Mesos
>>> and Hadoop next-gen have similar goals (more efficient resource sharing for
>>> data centers), but they are coming at it from different angles -- HNG is
>>> currently mainly focusing on MapReduce and aims to support other types of
>>> applications too, while Mesos was meant to support a very diverse set of
>>> applications, including long-running services and batch jobs (rather than
>>> only multiple instances of MapReduce), and is in fact being used for that
>>> already. More importantly, HNG is really two pieces -- a refactoring of
>>> MapReduce to allow one instance of MR per application, and a resource
>>> manager called YARN that lets these instances coordinate. We are going to
>>> support having the new MR2 application masters run on top of Mesos instead
>>> of YARN too (and indeed the refactoring is nice because it will enable
>>> Hadoop MapReduce to run on other cluster scheduling systems in the future).
>>>>>> In terms of the technical differences, here are some of the main
>>> ones currently:
>>>>>> - Mesos is implemented in C++ rather than Java, and has APIs in C++
>>> and Python in addition to Java.
>>>>>> - The resource allocation models are different: HNG has a central
>>> scheduler that supports data locality constraints, while Mesos provides
>>> "resource offers" to let applications pick the resources they like according
>>> to other criteria in addition to requests/filters to describe which
>>> resources you want to be offered. Our belief is that resource offers will
>>> allow Mesos to support a wider range of application scheduling needs, while
>>> simultaneously making the system more scalable and highly available
>>> (minimizing the state and work required of the master).
>>>>>> - Mesos can enforce resource isolation through Linux Containers to
>>> guard against misbehaving / greedy tasks.
>>>>>> - HNG supports Kerberos authentication for users.
>>>>>> - HNG can run the MR2 version of Hadoop, while Mesos can run Hadoop
>>> 0.20, Spark and MPI.
>>>>>> - There are some smaller architectural differences that may matter
>>> for some applications, such as communication being based on message-passing in
>>> Mesos vs periodic heartbeats in HNG, which allows Mesos to provide lower
>>> scheduling latencies (e.g. to still be efficient if your tasks take 100ms
>>> each).
>>>>>> However, overall, as Ted said, many of these differences will likely
>>> go away as both projects add features. What will be interesting is whether some
>>> fundamental differences in the target workloads remain, which I think is
>>> likely to happen. For example, the main deployment of Mesos is currently to
>>> run long-running stream processing services at Twitter, which is something
>>> that typical Hadoop environments just don't do and that requires different
>>> things from the cluster scheduler. I also believe we're going to see a lot
>>> of other cluster scheduling systems besides Mesos and HNG in the future, as
>>> people's requirements for these systems grow. There are some very
>>> challenging problems in designing a general cluster scheduling system that
>>> even the Google folks are still working hard on.
>>>>>> Matei
>>>>>> On Jun 30, 2011, at 6:26 PM, Edward J. Yoon wrote:
>>>>>>> Thanks for your nice and quick explanation!
>>>>>>> On Fri, Jul 1, 2011 at 10:21 AM, Ted Dunning <>
>>> wrote:
>>>>>>>> Technically speaking, Mesos has a less expressive model for expressing
>>>>>>>> resource requirements.  The thesis of Mesos is that the negotiation
>>>>>>>> between application and scheduler can make up for this missing
>>>>>>>> information.  Mesos was also first to "market", but Hadoop nextGen is
>>>>>>>> catching up fast.  The MR-279 has code that works, albeit with some
>>>>>>>> issues in production use.  From all reports, these issues are being
>>>>>>>> resolved quickly as Yahoo's considerable QA resources come to bear.
>>>>>>>> Politically speaking, Mesos has a nearly inactive mailing list which,
>>>>>>>> to outward appearances, indicates a nearly inactive project.  There is
>>>>>>>> some evidence that considerable activity is occurring off-list, but
>>>>>>>> this is a process bug in the Apache model since "if it doesn't happen
>>>>>>>> on the list, it doesn't happen".
>>>>>>>> On the other side, Hadoop nextGen has the Hadoop community very much
>>>>>>>> behind it.  Since HNG has the potential to break down some of the
>>>>>>>> deadlocks that have plagued the Hadoop community release process,
>>>>>>>> there is considerable enthusiasm for it.
>>>>>>>> Combined, these factors make it much more likely that HNG will be the
>>>>>>>> dominant force in the Hadoop world.  That is, more likely in my own
>>>>>>>> estimation.  Others may differ.
>>>>>>>> On Thu, Jun 30, 2011 at 5:16 PM, Edward J. Yoon <> wrote:
>>>>>>>>> Hi,
>>>>>>>>> I'm a newbie, and I wonder what the main differences are between
>>>>>>>>> Hadoop nextGen and Mesos.
>>>>>>>>> Thanks.
>>>>>>>>> --
>>>>>>>>> Best Regards, Edward J. Yoon
>>>>>>>>> @eddieyoon
>>>>>>> --
>>>>>>> Best Regards, Edward J. Yoon
>>>>>>> @eddieyoon
>>>>> --
>>>>> Best Regards, Edward J. Yoon
>>>>> @eddieyoon
>>> --
>>> Best Regards, Edward J. Yoon
>>> @eddieyoon
> -- 
> Best Regards, Edward J. Yoon
> @eddieyoon
