Reply-To: dev@aurora.incubator.apache.org
MIME-Version: 1.0
From: Josh Adams
Date: Wed, 16 Jul 2014 12:21:24 -0700
Subject: Re: Task Constraints
To: Kevin Burg
Cc: Kevin Sweeney, Aurora, Brian Wickman, Leo Kim
Content-Type: multipart/alternative; boundary=20cf3071cfc6fb326e04fe546e1a

--20cf3071cfc6fb326e04fe546e1a
Content-Type: text/plain; charset=UTF-8

+Leo Kim who is looking at the compiler error with us.

On Wed, Jul 16, 2014 at 8:25 AM, Kevin Burg wrote:

> The idea with the fix is to read the slave's attributes right off the
> offer rather than going into 'AttributeStore' and keying on the slave's
> name. The slave's resources are read off the offer in this way, so I don't
> see why it can't be done with attributes as well.
>
> Someone who understands all the places where SchedulingFilter.filter is
> used might be able to fix this better than I can.
>
>
> On Wed, Jul 16, 2014 at 6:40 AM, Josh Adams wrote:
>
>> Hi there,
>>
>> Given that we would need to disrupt running jobs to add constraints in
>> the future we are blocking on
>> https://issues.apache.org/jira/browse/AURORA-582 before we can push any
>> of our services on to Aurora in production.
>>
>> Kevin Burg attempted to resolve the related bug
>> https://issues.apache.org/jira/browse/AURORA-328 by making some changes
>> here:
>> https://github.com/foursquare/incubator-aurora/commit/b1962fad3fe9ef76954fa107abed25d78b809331
>> but we seem to be getting a type mismatch when compiling the code.
>>
>> Any help and/or info on the bugfix progress would be much appreciated.
>> Aside from AURORA-582 we are ready to roll (pun intended!)
>>
>> Best,
>> Josh
>>
>>
>> On Mon, Jul 14, 2014 at 11:42 AM, Josh Adams wrote:
>>
>>> Ah, makes sense. We'll try that. Thanks for clarifying this, Kevin.
>>>
>>> Josh
>>>
>>>
>>> On Mon, Jul 14, 2014 at 11:30 AM, Kevin Sweeney wrote:
>>>
>>>> Slaves persist their metadata (including attributes) across restarts
>>>> due to slave recovery (that's what allows you to upgrade mesos in-place
>>>> without killing the tasks they're managing). Unfortunately, to change
>>>> attributes you need to remove persisted slave metadata (the "meta"
>>>> directory). This will kill all of a slave's underlying tasks but the newly
>>>> registered slave should have the correct attributes.
>>>>
>>>>
>>>> On Mon, Jul 14, 2014 at 11:26 AM, Kevin Burg wrote:
>>>>
>>>>> I've confirmed by looking at that endpoint that new attributes are not
>>>>> being picked up and modified attributes are retaining their old values.
>>>>> This is after restarting both the slaves and the scheduler process.
>>>>>
>>>>>
>>>>> On Mon, Jul 14, 2014 at 11:09 AM, Josh Adams wrote:
>>>>>
>>>>> > Thanks Brian. Kevin should have some followup questions shortly.
>>>>> >
>>>>> > Josh
>>>>> >
>>>>> >
>>>>> > On Mon, Jul 14, 2014 at 10:37 AM, Brian Wickman wrote:
>>>>> >
>>>>> >> host/rack should not be treated specially.
>>>>> >>
>>>>> >> If you go to the "/slaves" endpoint on the scheduler UI, what does it
>>>>> >> report as attributes being exported by your slaves? You might want to
>>>>> >> validate there that the "staging" attribute got picked up properly. If
>>>>> >> it's not getting picked up (e.g. the attributes are getting cached
>>>>> >> incorrectly by the scheduler?) then you should file an issue.
>>>>> >>
>>>>> >>
>>>>> >> On Fri, Jul 11, 2014 at 5:24 PM, Kevin Burg wrote:
>>>>> >>
>>>>> >>> Hi,
>>>>> >>>
>>>>> >>> I'm having trouble getting the task constraint resolver to work with
>>>>> >>> attributes other than 'host' and 'rack'. Are arbitrary attribute keys in
>>>>> >>> the mesos slaves supported currently?
>>>>> >>>
>>>>> >>> Here is the setup.
>>>>> >>>
>>>>> >>> The slaves are configured to run with
>>>>> >>> `--attributes=host:;rack:;staging:true`
>>>>> >>>
>>>>> >>> (I've also tried this with staging:1 and staging:foo)
>>>>> >>>
>>>>> >>> The constraint generated from the .aurora config looks like the following:
>>>>> >>> Constraint(name:staging, constraint:
>>>>> >>> value:ValueConstraint(negated:false, values:[true])>)
>>>>> >>>
>>>>> >>> The schedule request then gets vetoed with the following veto object:
>>>>> >>> Veto{reason=Constraint not satisfied: staging, score=1000,
>>>>> >>> valueMismatch=true}]
>>>>> >>>
>>>>> >>> The constraints generated for 'host' and 'rack' look identical except for
>>>>> >>> the different name of course. I've even tried bouncing every mesos and
>>>>> >>> aurora process on the machine to see if maybe stale attributes were being
>>>>> >>> assigned to the slaves. All the offers being made to the master look
>>>>> >>> correct though, which leads me to believe that the constraint solver just
>>>>> >>> doesn't work for arbitrary attributes.
>>>>> >>>
>>>>> >>> We would appreciate any help you can offer.
>>>>> >>>
>>>>> >>> Thanks,
>>>>> >>> Kevin
>>>>> >>>
>>>>> >>
>>>>> >>
>>>>> >
>>>>> >
>>>>> > --
>>>>> > ===============
>>>>> > josh adams
>>>>> > production engineer
>>>>> > foursquare
>>>>> >
>>>>> > (gv) 415-830-4106
>>>>> > ===============
>>>>> > foursquare.com/jobs
>>>>> >
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> ===============
>>> josh adams
>>> production engineer
>>> foursquare
>>>
>>> (gv) 415-830-4106
>>> ===============
>>> foursquare.com/jobs
>>>
>>
>>
>>
>> --
>> ===============
>> josh adams
>> production engineer
>> foursquare
>>
>> (gv) 415-830-4106
>> ===============
>> foursquare.com/jobs
>>
>
>

--
===============
josh adams
production engineer
foursquare

(gv) 415-830-4106
===============
foursquare.com/jobs

--20cf3071cfc6fb326e04fe546e1a--
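
For readers following the thread, here is a minimal Python sketch of the behaviour under discussion: a value constraint such as staging:true is checked against the attributes carried on an offer, which is the direction Kevin Burg's fix proposes. This is an illustration only, not Aurora's or Mesos's actual code. The names `parse_attributes`, `check_value_constraint` and the `Veto` class are hypothetical, the host and rack values `h1` and `r1` are made up, and only the veto fields and the score of 1000 are taken from the output quoted in the thread. The parser handles plain text values only.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class Veto:
    """Hypothetical stand-in; field names mirror the veto quoted in the thread."""
    reason: str
    score: int
    value_mismatch: bool


def parse_attributes(flag_value: str) -> Dict[str, Set[str]]:
    """Parse a 'key:value;key:value' string into a key -> set-of-values map."""
    attributes: Dict[str, Set[str]] = {}
    for pair in flag_value.split(";"):
        if not pair:
            continue  # tolerate a trailing or doubled ';'
        key, _, value = pair.partition(":")
        attributes.setdefault(key, set()).add(value)
    return attributes


def check_value_constraint(
    name: str,
    values: Set[str],
    negated: bool,
    offer_attributes: Dict[str, Set[str]],
) -> Optional[Veto]:
    """Return None if the constraint is satisfied, otherwise a Veto."""
    found = bool(offer_attributes.get(name, set()) & values)
    # A non-negated constraint needs a match; a negated one needs no match.
    if found != negated:
        return None
    return Veto(f"Constraint not satisfied: {name}", 1000, True)


# Attributes as the slave is now configured (assumed example values).
fresh = parse_attributes("host:h1;rack:r1;staging:true")
# Attributes as they might look if the old persisted set were still in use.
stale = parse_attributes("host:h1;rack:r1")

print(check_value_constraint("staging", {"true"}, False, fresh))
print(check_value_constraint("staging", {"true"}, False, stale))
```

With the fresh attributes the check passes; with the stale set the same constraint produces a veto, which matches the symptom described when a slave's persisted metadata has not been cleared.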