incubator-mesos-dev mailing list archives

From Matei Zaharia <>
Subject Re: Question about Mesos.
Date Fri, 01 Jul 2011 04:59:05 GMT
I wouldn't say it's designed for Yahoo! only, but it's definitely meant to solve issues they
saw with large Hadoop clusters (and provides a lot of value for that).


On Jul 1, 2011, at 12:51 AM, Edward J. Yoon wrote:

> Hmm, HNG seems to be designed for their (Y!) own circumstances.
> On Fri, Jul 1, 2011 at 12:47 PM, Matei Zaharia <> wrote:
>> Ted brought up some superficial differences, but if you want to understand technical
differences, there are a bunch of those as well. Mesos and Hadoop next-gen have similar goals
(more efficient resource sharing for data centers), but they are coming at it from different
angles -- HNG is currently mainly focusing on MapReduce and aims to support other types of
applications too, while Mesos was meant to support a very diverse set of applications, including
long-running services and batch jobs (rather than only multiple instances of MapReduce), and
is in fact being used for that already. More importantly, HNG is really two pieces -- a refactoring
of MapReduce to allow one instance of MR per application, and a resource manager called YARN
that lets these instances coordinate. We are going to support having the new MR2 application
masters run on top of Mesos instead of YARN too (and indeed the refactoring is nice because
it will enable Hadoop MapReduce to run on other cluster scheduling systems in the future).
>> In terms of the technical differences, here are some of the main ones currently:
>> - Mesos is implemented in C++ rather than Java, and has APIs in C++ and Python in
addition to Java.
>> - The resource allocation models are different: HNG has a central scheduler that
supports data locality constraints, while Mesos provides "resource offers" that let applications
pick the resources they like according to criteria other than data locality, in addition to
requests/filters that describe which resources they want to be offered. Our belief is that
resource offers will allow Mesos to support a wider range of application scheduling needs, while
simultaneously making the system more scalable and highly available (minimizing the state and work required of the
>> - Mesos can enforce resource isolation through Linux Containers to guard against
misbehaving / greedy tasks.
>> - HNG supports Kerberos authentication for users.
>> - HNG can run the MR2 version of Hadoop, while Mesos can run Hadoop 0.20, Spark and
>> - There are some smaller architectural differences that may matter for some applications,
such as communication being based on message-passing in Mesos vs periodic heartbeats in HNG,
which allows Mesos to provide lower scheduling latencies (e.g. to still be efficient if your
tasks take 100ms each).
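
The "resource offers" model described in the list above can be sketched as a toy simulation. This is only an illustration under stated assumptions, not the Mesos API: `Offer`, `LocalityFramework`, `on_offer`, and `run_offer_round` are hypothetical names, and the "first framework in the list gets the first offer" ordering is a simplification rather than the allocation policy Mesos actually uses. The point it shows is that the central component only tracks free resources, while each application decides for itself whether an offer suits it.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Free resources on one node (hypothetical structure, not the Mesos API)."""
    node: str
    cpus: int
    mem_mb: int


class LocalityFramework:
    """Toy application scheduler that only accepts offers on nodes it prefers."""

    def __init__(self, name, preferred_nodes, task_cpus, task_mem_mb, tasks_wanted):
        self.name = name
        self.preferred_nodes = set(preferred_nodes)
        self.task_cpus = task_cpus
        self.task_mem_mb = task_mem_mb
        self.tasks_wanted = tasks_wanted

    def on_offer(self, offer):
        """Return how many tasks to launch on this offer (0 means decline)."""
        if self.tasks_wanted == 0 or offer.node not in self.preferred_nodes:
            return 0
        # Number of tasks that fit in the offered CPUs and memory.
        fit = min(offer.cpus // self.task_cpus, offer.mem_mb // self.task_mem_mb)
        launched = min(fit, self.tasks_wanted)
        self.tasks_wanted -= launched
        return launched


def run_offer_round(free, frameworks):
    """Offer each node's free resources to the frameworks in list order.

    `free` maps node name -> [cpus, mem_mb] and is updated in place.
    Returns a list of (framework name, node, task count) placements.
    """
    placements = []
    for node in list(free):
        for fw in frameworks:
            cpus, mem_mb = free[node]
            if cpus == 0 or mem_mb == 0:
                break  # nothing left to offer on this node
            n = fw.on_offer(Offer(node, cpus, mem_mb))
            if n > 0:
                free[node] = [cpus - n * fw.task_cpus, mem_mb - n * fw.task_mem_mb]
                placements.append((fw.name, node, n))
    return placements


if __name__ == "__main__":
    free = {"n1": [4, 8192], "n2": [4, 8192]}
    frameworks = [
        LocalityFramework("a", ["n1"], task_cpus=1, task_mem_mb=1024, tasks_wanted=3),
        LocalityFramework("b", ["n1", "n2"], task_cpus=2, task_mem_mb=2048, tasks_wanted=4),
    ]
    print(run_offer_round(free, frameworks))
    print(free)
```

In this sketch the offering loop never needs to know why a framework declined. That is the sense in which the design keeps state and decision logic out of the central component.
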
>> However, overall, as Ted said, many of these differences will likely go away as both
projects add features. What will be interesting is whether some fundamental differences in
the target workloads remain, which I think is likely to happen. For example, the main deployment
of Mesos is currently to run long-running stream processing services at Twitter, which is
something that typical Hadoop environments just don't do and that requires different things
from the cluster scheduler. I also believe we're going to see a lot of other cluster scheduling
systems besides Mesos and HNG in the future, as people's requirements for these systems grow.
There are some very challenging problems in designing a general cluster scheduling system
that even the Google folks are still working hard on.
>> Matei
>> On Jun 30, 2011, at 6:26 PM, Edward J. Yoon wrote:
>>> Thanks for your nice and quick explanation!
>>> On Fri, Jul 1, 2011 at 10:21 AM, Ted Dunning <> wrote:
>>>> Technically speaking, Mesos has a less expressive model for expressing
>>>> resource requirements.  The thesis of Mesos is that the negotiation between
>>>> application and scheduler can make up for this missing information.  Mesos
>>>> was also first to "market", but Hadoop nextGen is catching up fast.  The
>>>> MR-279 has code that works, albeit with some issues in production use.  From
>>>> all reports, these issues are being resolved quickly as Yahoo's considerable
>>>> QA resources come to bear.
>>>> Politically speaking, Mesos has a nearly inactive mailing list which, to
>>>> outward appearances, indicates a nearly inactive project.  There is some
>>>> evidence that considerable activity is occurring off-list, but this is a
>>>> process bug in the Apache model since "if it doesn't happen on the list,
>>>> it doesn't happen".
>>>> On the other side, Hadoop nextGen has the Hadoop community pretty much
>>>> behind it.  Since HNG has the potential to break down some of the deadlocks
>>>> that have plagued the Hadoop community release process, there is
>>>> considerable enthusiasm for it.
>>>> Combined, these factors make it much more likely that HNG will be the
>>>> dominant force in the Hadoop world.  That is, more likely in my own
>>>> estimation.  Others may differ.
>>>> On Thu, Jun 30, 2011 at 5:16 PM, Edward J. Yoon <> wrote:
>>>>> Hi,
>>>>> I'm a newbie, and I wonder what the main differences are between Hadoop
>>>>> nextGen and Mesos.
>>>>> Thanks.
>>>>> --
>>>>> Best Regards, Edward J. Yoon
>>>>> @eddieyoon
>>> --
>>> Best Regards, Edward J. Yoon
>>> @eddieyoon
> -- 
> Best Regards, Edward J. Yoon
> @eddieyoon
