aurora-issues mailing list archives

From "Bill Farner (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AURORA-1109) Add mesos role feature
Date Thu, 31 Dec 2015 01:14:49 GMT

    [ https://issues.apache.org/jira/browse/AURORA-1109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075587#comment-15075587
] 

Bill Farner commented on AURORA-1109:
-------------------------------------

{quote}
I propose we start from the first case, but keep an eye to make sure reconfiguring policy
could be supported.
{quote}

+1

{quote}
Are you referring to Mesos role or Aurora's role concept here?
{quote}

Mesos roles.  When you get into the code, this will probably make more sense.  Specifically,
{{ResourceSlot.toResourceList}} will need the {{Offer}} for context to construct the preferred
{{List<Protos.Resource>}}.
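
To illustrate why the {{Offer}} is needed, here is a minimal, self-contained sketch. It is not Aurora's actual code: {{Resource}} and {{Offer}} below are simplified, hypothetical stand-ins for the Mesos protobuf types, and the "preferred role first, then default role" ordering is an assumption about what "preferred" could mean here. The point is only that each resource sent back to Mesos must carry the role it had in the offer, so the offer has to be consulted when building the list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RoleAwareResources {
    static final String DEFAULT_ROLE = "*";

    // Hypothetical stand-in for Protos.Resource: a scalar amount tagged with a role.
    static final class Resource {
        final String name;
        final String role;
        final double amount;

        Resource(String name, String role, double amount) {
            this.name = name;
            this.role = role;
            this.amount = amount;
        }
    }

    // Hypothetical stand-in for Protos.Offer: just the resources it carries.
    static final class Offer {
        final List<Resource> resources;

        Offer(List<Resource> resources) {
            this.resources = resources;
        }
    }

    // Builds the resources to send back to Mesos, drawing from the preferred
    // role first and falling back to the default role for any remainder.
    static List<Resource> toResourceList(
            Map<String, Double> needed, Offer offer, String preferredRole) {
        List<String> roles = new ArrayList<>();
        roles.add(preferredRole);
        if (!preferredRole.equals(DEFAULT_ROLE)) {
            roles.add(DEFAULT_ROLE);
        }

        List<Resource> result = new ArrayList<>();
        for (Map.Entry<String, Double> entry : needed.entrySet()) {
            double remaining = entry.getValue();
            for (String role : roles) {
                for (Resource offered : offer.resources) {
                    if (remaining <= 0) {
                        break;
                    }
                    if (!offered.name.equals(entry.getKey()) || !offered.role.equals(role)) {
                        continue;
                    }
                    double taken = Math.min(remaining, offered.amount);
                    if (taken > 0) {
                        // Keep the role exactly as it appeared in the offer.
                        result.add(new Resource(offered.name, offered.role, taken));
                        remaining -= taken;
                    }
                }
            }
            if (remaining > 0) {
                throw new IllegalArgumentException(
                        "Offer cannot satisfy request for " + entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Resource> offered = new ArrayList<>();
        offered.add(new Resource("cpus", "jenkins", 1.0));
        offered.add(new Resource("cpus", DEFAULT_ROLE, 4.0));

        Map<String, Double> needed = new LinkedHashMap<>();
        needed.put("cpus", 2.0);

        for (Resource r : toResourceList(needed, new Offer(offered), "jenkins")) {
            System.out.println(r.name + " role=" + r.role + " amount=" + r.amount);
        }
    }
}
```

In this sketch a request for 2 cpus against an offer holding 1 cpu under the "jenkins" role and 4 cpus under the default role is split into two entries, one per role. Without the offer there would be no way to know how much of each role is available.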

> Add mesos role feature
> ----------------------
>
>                 Key: AURORA-1109
>                 URL: https://issues.apache.org/jira/browse/AURORA-1109
>             Project: Aurora
>          Issue Type: Story
>          Components: Scheduler
>            Reporter: zhanglong
>            Assignee: zhanglong
>
> Problems
> We are from the eBay platform team. Previously, we used Marathon to launch Jenkins master
instances on dedicated VMs and to receive resource offers from those same VMs. For details,
please refer to
> http://www.ebaytechblog.com/2014/04/04/delivering-ebays-ci-solution-with-apache-mesos-part-i/#.VNQUuC6_SPU
> We have since found Aurora to be more stable and powerful, and we are moving from Marathon to Aurora.
During the move, we found that Aurora currently has no support for Mesos roles. We need Mesos roles
to solve the problem described in the section "Frameworks stopped receiving offers after a while" of
the URL above.
> Here is a snippet of the problem description:
> The problem we noticed occurred after we used Marathon to create the initial set of CI masters. As
those CI masters started registering themselves as frameworks, Marathon stopped receiving
any offers from Mesos; essentially, no new CI masters could be launched. Let’s start with
Marathon. In the DRF model, it was unfair to treat Marathon in the same bucket/role alongside
hundreds of connected Jenkins frameworks. After launching all these Jenkins frameworks, Marathon
had a large resource share and Mesos would aggressively offer resources to frameworks that
were using little or no resources. Marathon was placed last in priority and got starved out.
> We decided to define a dedicated Mesos role for Marathon and to have all of the Mesos
slaves that were reserved for Jenkins master instances support that Mesos role. Jenkins frameworks
were left with the default role “*”. This solved the problem – Mesos offered resources
per role and hence Marathon never got starved out. A framework with a special role will get
resource offers from both slaves supporting that special role and also from the default role
“*”. However, since we were using placement constraints, Marathon accepted resource offers
only from slaves that supported both the role and the placement constraints.
> Solution
> So we added a role feature to the source code to solve the problem in the same way: when it accepts
a resource offer, Aurora sends the needed resources back to Mesos tagged with the Mesos role
from that resource offer.
> How to configure the Mesos role:
> 1. Add the command-line option --mesos_role=${Mesos role name} when starting the Aurora scheduler.
> We changed the test cases to match the code change. Every changed test case passes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
