incubator-mesos-dev mailing list archives

From Benjamin Hindman <b...@EECS.Berkeley.EDU>
Subject Mesos Updates
Date Wed, 11 May 2011 18:34:00 GMT
Hi All!

I've been rather silent over the past few months while focusing on Mesos. In particular, I have
been working at Twitter to help get Mesos deployed and used. I'm thrilled to say that Twitter
is invested in seeing the project succeed internally and in the open source community!

There has been a bunch of progress over the past few months that I'm happy to report. I thought
I would send a quick "state of the union" report on Mesos at Twitter and discuss what I think
is necessary to accomplish for our "first" Apache release.

Twitter has three different clusters running Mesos: a "test" cluster, a "non-production" (nonprod)
cluster, and a "production" (prod) cluster. The test cluster is where I incubate new versions
of Mesos before they get cascaded through nonprod and prod. The nonprod cluster is mostly
used for (1) experimental new services that are being developed internally and (2) load tests.
And the prod cluster is being used by numerous "streaming" services that perform different
tasks based on data that they are ingesting (for example, these services get data off of the
internal equivalent of the Twitter "firehose"). Only a few of the services running in prod
and nonprod have daemon-style "always up" requirements, but the uptimes have been looking
great as of late! There are some promising objectives right around the corner for Mesos
at Twitter, and I'm even more excited to report on those once they happen! This includes running
Hadoop on Mesos (not the primary reason Twitter was excited about Mesos in the first place),
as well as some rather "important" internal Twitter services ... stay tuned! ;)

There is still lots to be done (which I'll discuss briefly below), but that being said I'd
love to shoot for an early June date for our first Apache release. I'm not sure of the exact protocol
for this ... 

There are a few upcoming features that I wanted to hold out on for the first release (all
of which are being worked on):
(1) Eliminating SWIG as a dependency for the webui (the biggest blocker I've noticed for people
downloading and installing/running the system).
(2) Providing task history information.
(3) Handling slave upgrades/failures (without killing the running tasks).
(4) Launching schedulers via the master and persisting task information across failures.
(5) Implementing our resource hints mechanism, which has been renamed to "requests".

Two more things that I'd like to take care of/understand:

(*) What needs to occur when the time comes to offer some contributors roles as committers?

(*) It sounds like Matei has gotten our SVN stuff all set up, so we can bring the code in from
GitHub. I'm still a big fan of providing access to the code via GitHub, however; I think it
lowers the barrier to entry so that developers can download, read, and play with the code very easily.
I'm not sure how other projects do it, but I've been told that some projects share a presence
on both GitHub and Apache SVN?

If you got all the way through this email, thanks! I'm excited to see Mesos take the next


P.S. It appears I'm not on ... I guess I need to add myself?