aurora-dev mailing list archives

From Bill Farner <wfar...@apache.org>
Subject Re: Task Constraints
Date Thu, 17 Jul 2014 20:35:21 GMT
I've taken on the ticket and have a fix posted, hopefully to be committed
today.

-=Bill


On Wed, Jul 16, 2014 at 12:21 PM, Josh Adams <josh@foursquare.com> wrote:

> +Leo Kim who is looking at the compiler error with us.
>
>
> On Wed, Jul 16, 2014 at 8:25 AM, Kevin Burg <kburg@foursquare.com> wrote:
>
> > The idea with the fix is to read the slave's attributes right off the
> > offer rather than going into 'AttributeStore' and keying on the slave's
> > name. The slave's resources are read off the offer in this way, so I don't
> > see why it can't be done with attributes as well.
> >
> > Someone who understands all the places where SchedulingFilter.filter is
> > used might be able to fix this better than I can.
> >
> >
> > On Wed, Jul 16, 2014 at 6:40 AM, Josh Adams <josh@foursquare.com> wrote:
> >
> >> Hi there,
> >>
> >> Given that we would need to disrupt running jobs to add constraints in
> >> the future, we are blocked on
> >> https://issues.apache.org/jira/browse/AURORA-582 before we can push any
> >> of our services onto Aurora in production.
> >>
> >> Kevin Burg attempted to resolve the related bug
> >> https://issues.apache.org/jira/browse/AURORA-328 by making some changes
> >> here:
> >>
> >> https://github.com/foursquare/incubator-aurora/commit/b1962fad3fe9ef76954fa107abed25d78b809331
> >> but we seem to be getting a type mismatch when compiling the code.
> >>
> >> Any help and/or info on the bugfix progress would be much appreciated.
> >> Aside from AURORA-582 we are ready to roll (pun intended!)
> >>
> >> Best,
> >> Josh
> >>
> >>
> >> On Mon, Jul 14, 2014 at 11:42 AM, Josh Adams <josh@foursquare.com> wrote:
> >>
> >>> Ah, makes sense. We'll try that. Thanks for clarifying this Kevin.
> >>>
> >>> Josh
> >>>
> >>>
> >>> On Mon, Jul 14, 2014 at 11:30 AM, Kevin Sweeney <kevints@apache.org>
> >>> wrote:
> >>>
> >>>> Slaves persist their metadata (including attributes) across restarts
> >>>> due to slave recovery (that's what allows you to upgrade mesos in-place
> >>>> without killing the tasks they're managing). Unfortunately, to change
> >>>> attributes you need to remove persisted slave metadata (the "meta"
> >>>> directory). This will kill all of a slave's underlying tasks, but the newly
> >>>> registered slave should have the correct attributes.
> >>>>
> >>>>
> >>>> On Mon, Jul 14, 2014 at 11:26 AM, Kevin Burg <kburg@foursquare.com>
> >>>> wrote:
> >>>>
> >>>>> I've confirmed by looking at that endpoint that new attributes are not
> >>>>> being picked up and modified attributes are retaining their old values.
> >>>>> This is after restarting both the slaves and the scheduler process.
> >>>>>
> >>>>>
> >>>>> On Mon, Jul 14, 2014 at 11:09 AM, Josh Adams <josh@foursquare.com>
> >>>>> wrote:
> >>>>>
> >>>>> > Thanks Brian. Kevin should have some followup questions shortly.
> >>>>> >
> >>>>> > Josh
> >>>>> >
> >>>>> >
> >>>>> > On Mon, Jul 14, 2014 at 10:37 AM, Brian Wickman <wickman@apache.org>
> >>>>> > wrote:
> >>>>> >
> >>>>> >> host/rack should not be treated specially.
> >>>>> >>
> >>>>> >> If you go to the "/slaves" endpoint on the scheduler UI, what does it
> >>>>> >> report as attributes being exported by your slaves?  You might want to
> >>>>> >> validate there that the "staging" attribute got picked up properly.  If
> >>>>> >> it's not getting picked up (e.g. the attributes are getting cached
> >>>>> >> incorrectly by the scheduler?) then you should file an issue.
> >>>>> >>
> >>>>> >>
> >>>>> >> On Fri, Jul 11, 2014 at 5:24 PM, Kevin Burg <kburg@foursquare.com> wrote:
> >>>>> >>
> >>>>> >>> Hi,
> >>>>> >>>
> >>>>> >>> I'm having trouble getting the task constraint resolver to work with
> >>>>> >>> attributes other than 'host' and 'rack'. Are arbitrary attribute keys in
> >>>>> >>> the mesos slaves supported currently?
> >>>>> >>>
> >>>>> >>> Here is the setup.
> >>>>> >>>
> >>>>> >>> The slaves are configured to run with
> >>>>> >>> `--attributes=host:<host>;rack:<rack>;staging:true`
> >>>>> >>>
> >>>>> >>> (I've also tried this with staging:1, and staging:foo)
> >>>>> >>>
> >>>>> >>> The constraint generated from the .aurora config looks like the following:
> >>>>> >>> Constraint(name:staging, constraint:<TaskConstraint
> >>>>> >>> value:ValueConstraint(negated:false, values:[true])>)
> >>>>> >>>
> >>>>> >>> The schedule request then gets vetoed with the following veto object:
> >>>>> >>> Veto{reason=Constraint not satisfied: staging, score=1000,
> >>>>> >>> valueMismatch=true}]
> >>>>> >>>
> >>>>> >>> The constraints generated for 'host' and 'rack' look identical except for
> >>>>> >>> the different name of course. I've even tried bouncing every mesos and
> >>>>> >>> aurora process on the machine to see if maybe stale attributes were being
> >>>>> >>> assigned to the slaves. All the offers being made to the master look
> >>>>> >>> correct though, which leads me to believe that the constraint solver just
> >>>>> >>> doesn't work for arbitrary attributes.
> >>>>> >>>
> >>>>> >>> We would appreciate any help you can offer.
> >>>>> >>>
> >>>>> >>> Thanks,
> >>>>> >>> Kevin
> >>>>> >>>
> >>>>> >>
> >>>>> >>
> >>>>> >
> >>>>> >
> >>>>> > --
> >>>>> > ===============
> >>>>> > josh adams
> >>>>> > production engineer
> >>>>> > foursquare
> >>>>> >
> >>>>> > (gv) 415-830-4106
> >>>>> > ===============
> >>>>> > foursquare.com/jobs
> >>>>> >
> >>>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>
> >>
> >>
> >>
> >
> >
>
>
>
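
The behaviour discussed in this thread can be illustrated with a small, self-contained Python sketch. It is not Aurora's or Mesos's code: `parse_attributes`, `ValueConstraint`, `stale_store`, and the slave ID "slave-1" are hypothetical names invented for illustration, and every attribute value is treated as plain text. It only shows why a value constraint on `staging` is vetoed when the scheduler checks a stale per-slave attribute record, and why the same constraint passes when the attributes are read directly from the offer, as proposed above.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Set


def parse_attributes(flag_value: str) -> Dict[str, Set[str]]:
    """Parse a 'key:value;key:value' string into {key: {values}}.

    Assumption: every value is treated as plain text.
    """
    attributes: Dict[str, Set[str]] = {}
    for pair in flag_value.split(";"):
        if not pair:
            continue
        key, _, value = pair.partition(":")
        attributes.setdefault(key.strip(), set()).add(value.strip())
    return attributes


@dataclass(frozen=True)
class ValueConstraint:
    """Hypothetical stand-in for a value constraint on one attribute name."""

    name: str
    values: FrozenSet[str]
    negated: bool = False

    def satisfied_by(self, attributes: Dict[str, Set[str]]) -> bool:
        # The constraint matches when the slave exposes at least one of the
        # requested values; 'negated' inverts the result.
        matched = bool(attributes.get(self.name, set()) & self.values)
        return matched != self.negated


# Attributes as they would appear on a fresh offer from the slave.
offer_attributes = parse_attributes("host:h1;rack:r1;staging:true")

# Hypothetical stale record keyed by slave ID, saved before 'staging' was added.
stale_store = {"slave-1": parse_attributes("host:h1;rack:r1")}

constraint = ValueConstraint(name="staging", values=frozenset({"true"}))

vetoed_via_store = not constraint.satisfied_by(stale_store["slave-1"])
vetoed_via_offer = not constraint.satisfied_by(offer_attributes)

print("vetoed when reading the stale store:", vetoed_via_store)
print("vetoed when reading the offer:", vetoed_via_offer)
```

In this sketch the 'host' and 'rack' constraints would pass against either source, which is consistent with the observation that only the newly added attribute failed.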
