aurora-dev mailing list archives

From Kevin Burg <kb...@foursquare.com>
Subject Re: Task Constraints
Date Mon, 14 Jul 2014 19:35:35 GMT
Removing the meta directory does not fix the issue. Upon further
inspection, the scheduler seems to be using very old slave ids. These slave
ids aren't even in "mesos/slave/workdir/slaves" anymore. I should add that
the "/offers" endpoint on the scheduler shows all the up-to-date
information, including correct slave_ids and attributes.

The slaves are neither failing nor logging anything during any of these
attribute changes.
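
As an illustration of the symptom described in this thread, here is a minimal
Python sketch of how a value constraint could be checked against a slave's
attributes. It is not Aurora's or Mesos's implementation: `parse_attributes`
and `satisfies` are hypothetical helpers, and the only things taken from the
thread are the `key:value;key:value` attribute format and the
`negated`/`values` fields of the constraint. It shows why a scheduler holding
stale attributes (no `staging` key) would veto the task even though the slave
was restarted with `staging:true`.

```python
def parse_attributes(flag_value):
    """Parse a 'key:value;key:value' attribute string into {key: set(values)}."""
    attrs = {}
    for pair in flag_value.split(";"):
        if not pair:
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = pair.partition(":")
        attrs.setdefault(key.strip(), set()).add(value.strip())
    return attrs


def satisfies(attrs, name, values, negated=False):
    """Return True when the (hypothetical) value constraint on `name` is met."""
    present = attrs.get(name, set())
    matched = bool(present & set(values))
    # A negated constraint passes only when none of the values match.
    return matched != negated


# Attributes the slave was restarted with, versus what a stale cache might hold.
fresh = parse_attributes("host:h1;rack:r1;staging:true")
stale = parse_attributes("host:h1;rack:r1")

print(satisfies(fresh, "staging", ["true"]))  # True
print(satisfies(stale, "staging", ["true"]))  # False, i.e. the task is vetoed
```

Under these assumptions, the constraint itself is fine; the veto follows
entirely from which copy of the attributes is consulted.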


On Mon, Jul 14, 2014 at 12:12 PM, Bill Farner <wfarner@apache.org> wrote:

> However, the slave should be failing and logging this (rather than
> silently working with old attributes).  If you find otherwise, you should
> file a bug against mesos.
>
>
> On Monday, July 14, 2014, Josh Adams <josh@foursquare.com> wrote:
>
>> Ah, makes sense. We'll try that. Thanks for clarifying this Kevin.
>>
>> Josh
>>
>>
>> On Mon, Jul 14, 2014 at 11:30 AM, Kevin Sweeney <kevints@apache.org>
>> wrote:
>>
>> > Slaves persist their state (including attributes) across restarts due
>> > to slave recovery (that's what allows you to upgrade mesos in-place
>> > without killing the tasks they're managing). Unfortunately, to change
>> > attributes you need to remove the persisted slave metadata (the "meta"
>> > directory). This will kill all of the slave's underlying tasks, but the
>> > newly registered slave should have the correct attributes.
>> >
>> >
>> > On Mon, Jul 14, 2014 at 11:26 AM, Kevin Burg <kburg@foursquare.com>
>> wrote:
>> >
>> >> I've confirmed by looking at that endpoint that new attributes are not
>> >> being picked up and modified attributes are retaining their old values.
>> >> This is after restarting both the slaves and the scheduler process.
>> >>
>> >>
>> >> On Mon, Jul 14, 2014 at 11:09 AM, Josh Adams <josh@foursquare.com>
>> wrote:
>> >>
>> >> > Thanks Brian. Kevin should have some followup questions shortly.
>> >> >
>> >> > Josh
>> >> >
>> >> >
>> >> > On Mon, Jul 14, 2014 at 10:37 AM, Brian Wickman <wickman@apache.org>
>> >> > wrote:
>> >> >
>> >> >> host/rack should not be treated specially.
>> >> >>
>> >> >> If you go to the "/slaves" endpoint on the scheduler UI, what does it
>> >> >> report as attributes being exported by your slaves?  You might want to
>> >> >> validate there that the "staging" attribute got picked up properly.  If
>> >> >> it's not getting picked up (e.g. the attributes are getting cached
>> >> >> incorrectly by the scheduler?) then you should file an issue.
>> >> >>
>> >> >>
>> >> >> On Fri, Jul 11, 2014 at 5:24 PM, Kevin Burg <kburg@foursquare.com>
>> >> wrote:
>> >> >>
>> >> >>> Hi,
>> >> >>>
>> >> >>> I'm having trouble getting the task constraint resolver to work with
>> >> >>> attributes other than 'host' and 'rack'. Are arbitrary attribute keys
>> >> >>> in the mesos slaves currently supported?
>> >> >>>
>> >> >>> Here is the setup.
>> >> >>>
>> >> >>> The slaves are configured to run with
>> >> >>> `--attributes=host:<host>;rack:<rack>;staging:true`
>> >> >>>
>> >> >>> (I've also tried this with staging:1, and staging:foo)
>> >> >>>
>> >> >>> The constraint generated from the .aurora config looks like the
>> >> >>> following:
>> >> >>> Constraint(name:staging, constraint:<TaskConstraint
>> >> >>> value:ValueConstraint(negated:false, values:[true])>)
>> >> >>>
>> >> >>> The schedule request then gets vetoed with the following veto object:
>> >> >>> Veto{reason=Constraint not satisfied: staging, score=1000,
>> >> >>> valueMismatch=true}]
>> >> >>>
>> >> >>> The constraints generated for 'host' and 'rack' look identical except
>> >> >>> for the different name, of course. I've even tried bouncing every mesos
>> >> >>> and aurora process on the machine to see if maybe stale attributes were
>> >> >>> being assigned to the slaves. All the offers being made to the master
>> >> >>> look correct though, which leads me to believe that the constraint
>> >> >>> solver just doesn't work for arbitrary attributes.
>> >> >>>
>> >> >>> We would appreciate any help you can offer.
>> >> >>>
>> >> >>> Thanks,
>> >> >>> Kevin
>> >> >>>
>> >> >>
>> >> >>
>> >> >
>> >> >
>> >> > --
>> >> > ===============
>> >> > josh adams
>> >> > production engineer
>> >> > foursquare
>> >> >
>> >> > (gv) 415-830-4106
>> >> > ===============
>> >> > foursquare.com/jobs
>> >> >
>> >>
>> >
>> >
>>
>>
>> --
>> ===============
>> josh adams
>> production engineer
>> foursquare
>>
>> (gv) 415-830-4106
>> ===============
>> foursquare.com/jobs
>>
>
>
> --
> -=Bill
>
>
