mesos-dev mailing list archives

From Adam Bordelon <a...@mesosphere.io>
Subject Re: Fair sharing question: Should it be among frameworks or among users?
Date Wed, 02 Apr 2014 08:06:30 GMT
Hi Li,

I recently got a chance to dig deeper into the resource allocator code and
want to share what I learned. Mesos already distributes its resource offers
primarily by 'user' and secondarily by framework. However, the 'user'
referred to is the user running/registering the framework, which isn't
really what you're looking for. But I do think Mesos could be adapted to
solve your use case.

Here's how it works now:
1. When a framework registers, it specifies a user in its FrameworkInfo
("framework-user"). If this is set to the empty string, Mesos sets it to the
user currently running the framework scheduler.
2. When the master decides which framework to offer its resources to next,
it first looks at which framework-user has the lowest share, then which
framework belonging to that user has the lowest share.
3. When a framework receives an offer, it decides whether to accept or
decline the offer, and what task(s) to run with that offer.
4. When a task runs, it runs as the user specified in the FrameworkInfo.
MESOS-1161 will allow the framework to set a different user per task
("task-user") in the CommandInfo when launching each task, getting us one
step closer to what you want. But even once frameworks can run tasks as
different users, the master allocator still won't know which users want to
launch tasks before it allocates a resource offer to a framework.
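
As a rough illustration of step 2, here is a minimal Python sketch of the
two-level selection (framework-user first, then framework). It is not the
actual Mesos allocator code; the names dominant_share, sum_resources and
pick_next_framework, and the shape of the allocations dictionary, are
assumptions made only for this example.

```python
from typing import Dict, Iterable, Tuple

Resources = Dict[str, float]  # e.g. {"cpus": 4.0, "mem": 1024.0}


def dominant_share(allocated: Resources, total: Resources) -> float:
    """Largest fraction of any single cluster resource held by `allocated`."""
    return max((allocated.get(r, 0.0) / total[r] for r in total), default=0.0)


def sum_resources(items: Iterable[Resources]) -> Resources:
    """Add up several resource dictionaries, name by name."""
    out: Resources = {}
    for res in items:
        for name, amount in res.items():
            out[name] = out.get(name, 0.0) + amount
    return out


def pick_next_framework(
    total: Resources,
    allocations: Dict[str, Dict[str, Resources]],
) -> Tuple[str, str]:
    """allocations maps framework-user -> framework -> allocated resources.

    Level 1: choose the framework-user with the lowest dominant share.
    Level 2: choose that user's framework with the lowest dominant share.
    Names are sorted first so that ties are broken deterministically.
    """
    user = min(
        sorted(allocations),
        key=lambda u: dominant_share(sum_resources(allocations[u].values()), total),
    )
    frameworks = allocations[user]
    framework = min(
        sorted(frameworks),
        key=lambda f: dominant_share(frameworks[f], total),
    )
    return user, framework
```

Note that the sketch only decides who gets the next offer; whether the
framework accepts it and which tasks it launches (step 3) is still up to the
framework scheduler.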

This is why Mesos is known as a two-level scheduler: first the master
allocator makes a global decision, then the framework scheduler decides within
its own realm.
Relying on the master allocator alone could work fine for you if each user
launches a new framework instance for each job they want to run, but does
not work for long-running services that take task requests from multiple
users.
Relying on the framework scheduler could work for you if you only care
about a single framework on the cluster, but won't help you balance users
between multiple frameworks.

If I understand correctly, you want to schedule primarily based on the
resource shares currently allocated to the various users running the tasks,
regardless of how many different frameworks there are, or what users are
running the frameworks themselves. Mesos' DRF sorter/allocator approach
could still be used to provide dominant resource fairness among task-users,
but it would need a few additional pieces of information. First, frameworks
would have to update the master allocator with a list of users wanting to
launch tasks; this could be added to the existing resource request message.
Secondly, the master would need to track the resources allocated to each
user's tasks. Then the master could use the existing DRF algorithm to
select the task-user furthest below his/her fair share, then select which
of that user's frameworks to give the offer to, and pass the user selection
to the selected framework along with the resource offer.
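
To make the proposal a bit more concrete, here is a minimal Python sketch of
what such a task-user selection might look like. None of this exists in Mesos
today. The function pick_task_user_offer, the pending map (framework -> set
of task-users it reported as waiting) and the rule of picking the candidate
framework with the lowest dominant share are all assumptions for
illustration; the real design would still need to be worked out.

```python
from typing import Dict, Optional, Set, Tuple

Resources = Dict[str, float]  # e.g. {"cpus": 4.0, "mem": 1024.0}


def dominant_share(allocated: Resources, total: Resources) -> float:
    """Largest fraction of any single cluster resource held by `allocated`."""
    return max((allocated.get(r, 0.0) / total[r] for r in total), default=0.0)


def pick_task_user_offer(
    total: Resources,
    user_alloc: Dict[str, Resources],       # resources held by each task-user's tasks
    framework_alloc: Dict[str, Resources],  # resources held by each framework
    pending: Dict[str, Set[str]],           # framework -> task-users waiting to launch
) -> Optional[Tuple[str, str]]:
    """Return (task_user, framework) for the next offer, or None if nobody waits."""
    # Only task-users that some framework reported as waiting are eligible.
    waiting = sorted({u for users in pending.values() for u in users})
    if not waiting:
        return None
    # Pick the waiting task-user furthest below their fair share.
    user = min(waiting, key=lambda u: dominant_share(user_alloc.get(u, {}), total))
    # Among frameworks with work for that user, pick the one with the lowest
    # dominant share (an assumed tie-in with the existing framework sorter).
    candidates = sorted(f for f, users in pending.items() if user in users)
    framework = min(
        candidates,
        key=lambda f: dominant_share(framework_alloc.get(f, {}), total),
    )
    return user, framework
```

The returned task-user would be passed along with the offer, so the selected
framework knows on whose behalf the resources were allocated.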

In summary, I do believe that Mesos can be adapted to allocate resource
shares among task-users, but we will need to flesh out the design and
implement it. Personally, I find the idea fascinating. Please share more
information about your use case and requirements, and even file a new JIRA
for this feature if you like.

Thank you,
-Adam-
mesosphere.io


On Tue, Mar 25, 2014 at 9:40 PM, Li Jin <ice.xelloss@gmail.com> wrote:

> Thanks again for the reply.
>
> I should clarify that I am considering the case where U1-U10 have unlimited
> requests. I would like to give freed-up resources to the users (U1-U10)
> with the minimum drs instead of the frameworks (A,B,C) with the minimum drs.
>
> I am curious what other people think and am open to discussion.
>
>
> On Wed, Mar 26, 2014 at 12:12 AM, Chengwei Yang
> <chengwei.yang.cn@gmail.com>wrote:
>
> > On Tue, Mar 25, 2014 at 11:45:18PM -0400, Li Jin wrote:
> > > Thanks for the reply. Still, it's not clear to me how DRF would help in
> > > this case, let me elaborate:
> > >
> > > Let's say there are 3 frameworks A,B,C, run by users F1, F2, F3, and
> > > there are 10 users, U1-U10, running tasks through A,B,C.
> > >
> > > Now, using DRF between frameworks with equal weight, I believe the resources
> > > will be equally distributed among the 3 frameworks. Is it possible for
> > > Mesos to equally distribute the resources among the 10 users?
> >
> > The simple answer is *NO*. DRF isn't an equal partition of resources; in
> > fact no one needs an equal partition, since it always depends on the
> > resource requests. Resource allocation is a dynamic process, not a static
> > partition of all available resources, considering that only a
> > few users may be running tasks in the same time slot.
> >
> > For example, suppose more than one user has more than one task
> > ready to run. DRF first selects the user who has the
> > minimum dominant resource share; if that user has more than one task (say
> > a framework A task, a framework B task, ...), it then selects the framework
> > task which has the minimum dominant resource share.
> >
> > BTW, just my understanding from the paper, any mistake please correct
> > me.
> >
> > --
> > Thanks,
> > Chengwei
> >
> > >
> > > Thanks,
> > > Li
> > >
> > >
> > > On Tue, Mar 25, 2014 at 10:39 PM, Chengwei Yang
> > > <chengwei.yang.cn@gmail.com>wrote:
> > >
> > > > On Tue, Mar 25, 2014 at 06:17:11PM -0400, Li Jin wrote:
> > > > > Dear Devs,
> > > > >
> > > > > We are seriously investigating using Mesos as the backbone of our
> > > > > compute infrastructure. One important question I would like to ask is
> > > > > about fair sharing.
> > > > >
> > > > > As I understand it, assuming you have 3 frameworks and 100 users using
> > > > > those frameworks, the current algorithm gives each framework 33%
> > > > > (assuming same weight), no matter how many users each framework has.
> > > > > In our case,
> > > >
> > > > I don't think so. By default, the DRF allocator is used among users and
> > > > among each user's frameworks.
> > > >
> > > > See below options of mesos-master.
> > > >
> > > >   --framework_sorter=VALUE        Policy to use for allocating resources
> > > >                                   between a given user's frameworks. Options
> > > >                                   are the same as for user_allocator
> > > >                                   (default: drf)
> > > >
> > > >   --user_sorter=VALUE             Policy to use for allocating resources
> > > >                                   between users. May be one of:
> > > >                                   dominant_resource_fairness (drf)
> > > >                                   (default: drf)
> > > >
> > > > For DRF, please see this paper.
> > > > http://people.csail.mit.edu/matei/papers/2011/nsdi_drf.pdf
> > > >
> > > > --
> > > > Thanks,
> > > > Chengwei
> > > >
> > > > > actually we would like to give each user 1% of the cluster, no matter
> > > > > which framework they use. The reasons are:
> > > > >
> > > > > (1) It's much easier for us to decide weights between users than weights
> > > > > between frameworks.
> > > > > (2) It makes it much easier to add and remove frameworks since it won't
> > > > > change the distribution of fair share.
> > > > >
> > > > > In general, I feel frameworks compute on behalf of users and thus users
> > > > > should "pay" for the computation.
> > > > >
> > > > > I am wondering if this makes sense and if this is something that could be
> > > > > supported by Mesos.
> > > > >
> > > > > Thanks,
> > > > > Li
> > > >
> >
>
