Subject: Re: Populate DiscoveryInfo in Mesos
From: Maxim Khutornenko
To: dev@aurora.apache.org
Date: Tue, 5 Apr 2016 10:53:16 -0700

Left a few comments in
the RB. I am +1 on this change overall.

On Tue, Apr 5, 2016 at 10:51 AM, Zhitao Li wrote:
> I just updated it one more time this morning and all previous comments have
> been addressed.
>
> Given that this is an opt-in experimental feature (no effect by default), I
> believe it's ready to be landed.
>
> Looking forward to more use cases from this :)
>
> On Tue, Apr 5, 2016 at 10:47 AM, Zameer Manji wrote:
>
>> I have no concerns. Overall I like this change because users of Aurora can
>> use other tools available in the Mesos ecosystem. It would be nice if it
>> could land before the RC planned for tomorrow.
>>
>> On Tue, Apr 5, 2016 at 6:49 AM, Erb, Stephan wrote:
>>
>> > There has been some promising progress on the review request:
>> > https://reviews.apache.org/r/45177/
>> >
>> > Does anyone else have comments, or has anyone identified a blocking
>> > issue? Otherwise, this beta feature is close to merging, probably even
>> > before the RC planned for tomorrow.
>> > ________________________________________
>> > From: Zhitao Li
>> > Sent: Friday, April 1, 2016 03:10
>> > To: dev@aurora.apache.org
>> > Subject: Re: Populate DiscoveryInfo in Mesos
>> >
>> > Benjamin,
>> >
>> > You are exactly right. The problem is on the Mesos DNS side, because it
>> > has its own rules for shortening names and replacing dots with other
>> > characters.
>> >
>> > IMO, relying on generating one "name" that would be useful for all
>> > systems may be idealistic. I like the "label" concept in recent
>> > Mesos/Docker systems, and probably Mesos DNS should take an optional
>> > label to allow its users to customize the behavior. Aurora could easily
>> > adopt that, e.g. by duplicating labels from TaskInfo to DiscoveryInfo.
>> >
>> > Right now, the only open-source project using DiscoveryInfo is Mesos DNS,
>> > so there is no real convention in the community yet.
>> >
>> >
>> > On Thu, Mar 31, 2016 at 5:39 PM, benley@gmail.com wrote:
>> >
>> > > FYI, Aurora already populates the "executor source" field (not sure
>> > > exactly what that corresponds to in mesos.proto) with exactly the data
>> > > you would want to send to mesos-dns:
>> > > rolename.environment.jobname.[tasknumber] for each task. Maybe you
>> > > would need to invert the order of the fields, but that's pretty much
>> > > the right thing.
>> > >
>> > > On Thu, Mar 31, 2016 at 12:53 PM Zhitao Li wrote:
>> > >
>> > > > Hi Stephan,
>> > > >
>> > > > I like your proposals, but I think they all require some changes to
>> > > > Mesos DNS to support this level of customization. I've filed a github
>> > > > issue to mesos-dns to describe what I want.
>> > > >
>> > > > I've updated my patch to include a unit test and a command-line flag
>> > > > switch, and it's ready for review now.
>> > > >
>> > > > On Thu, Mar 31, 2016 at 2:32 AM, Erb, Stephan
>> > > > <Stephan.Erb@blue-yonder.com> wrote:
>> > > >
>> > > > > If I understand your example correctly, the underlying jobkey used
>> > > > > to generate "vagranttesthttp-exampled.twitterscheduler.mesos" was
>> > > > > "vagrant/test/http-exampled" and what we actually put into the
>> > > > > DiscoveryInfo is "vagrant.test.http-exampled".
>> > > > >
>> > > > > So how about:
>> > > > > * we inject inverse names. So for example:
>> > > > >   "http-exampled.test.vagrant"
>> > > > > * we teach mesos-DNS that it should not silently drop dots in our
>> > > > >   names
>> > > > >
>> > > > > That should provide us with hierarchical, collision-free DNS names
>> > > > > such as "http-exampled.test.vagrant.twitterscheduler.mesos".
>> > > > >
>> > > > > Bonus points if we get "twitterscheduler" replaced by the actual
>> > > > > cluster name.
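The naming schemes compared in the message above can be sketched in a few lines. This is illustrative only: `flatten_dots` models just the dot-dropping visible in the thread's example ("vagrant.test.http-exampled" becoming "vagranttesthttp-exampled"), not the actual name-mangling rules of Mesos DNS, and every function name here is hypothetical.

```python
from typing import Tuple


def split_job_key(job_key: str) -> Tuple[str, str, str]:
    """Split an Aurora job key of the form role/environment/job."""
    role, env, job = job_key.split("/")
    return role, env, job


def dotted_name(job_key: str) -> str:
    """role.env.job, the form the thread says is put into DiscoveryInfo."""
    return ".".join(split_job_key(job_key))


def inverse_name(job_key: str) -> str:
    """job.env.role, the inverse order proposed in the thread."""
    return ".".join(reversed(split_job_key(job_key)))


def flatten_dots(name: str) -> str:
    """Assumption: mimic only the dot-dropping seen in the thread's example."""
    return name.replace(".", "")


def a_record(label: str, framework: str = "twitterscheduler",
             domain: str = "mesos") -> str:
    """Append the framework and domain suffix used in the thread's examples."""
    return f"{label}.{framework}.{domain}"


if __name__ == "__main__":
    key = "vagrant/test/http-exampled"
    # Current behaviour described in the thread: dots silently dropped.
    print(a_record(flatten_dots(dotted_name(key))))
    # Proposed behaviour: inverse order with dots preserved.
    print(a_record(inverse_name(key)))
    # With dots dropped, distinct job keys can collapse to the same label.
    print(flatten_dots(dotted_name("a/bc/job")) ==
          flatten_dots(dotted_name("ab/c/job")))
```

The last line shows why keeping the dots matters: once separators are removed, "a/bc/job" and "ab/c/job" are indistinguishable, whereas the dotted forms stay distinct.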
>> > > > >
>> > > > > ________________________________________
>> > > > > From: Zhitao Li
>> > > > > Sent: Thursday, March 31, 2016 01:08
>> > > > > To: dev@aurora.apache.org
>> > > > > Subject: Re: Populate DiscoveryInfo in Mesos
>> > > > >
>> > > > > On Wed, Mar 30, 2016 at 3:58 PM, Joshua Cohen wrote:
>> > > > >
>> > > > > > Job names are not unique though, what would happen if multiple
>> > > > > > jobs had the same name (either across roles or across
>> > > > > > environments in the same role)?
>> > > > > >
>> > > > >
>> > > > > Good point. They would conflict with each other, and I guess in
>> > > > > that case Mesos DNS should not be used with the cluster.
>> > > > >
>> > > > > An alternative is {role}-{job name}, although there are still ways
>> > > > > to create conflicts in that case (e.g. "role-dummy/test/job" and
>> > > > > "role/test/dummy-job" generate the same name).
>> > > > >
>> > > > > I think the correct long-term approach is to allow some way to
>> > > > > configure this information by task or job. I'm a bit hesitant to
>> > > > > include new thrift structures for this experiment, and maybe the
>> > > > > idea of "TaskInfoDecorator" (see my previous posts) would be more
>> > > > > flexible?
>> > > > >
>> > > > >
>> > > > > >
>> > > > > > On Wed, Mar 30, 2016 at 5:33 PM, Zhitao Li
>> > > > > > <zhitaoli.cs@gmail.com> wrote:
>> > > > > >
>> > > > > > > Stephan,
>> > > > > > >
>> > > > > > > So I've managed to run the official Mesos DNS docker container
>> > > > > > > under the Aurora vagrant environment and get some SRV/A records
>> > > > > > > pulled from Mesos master from Aurora.
>> > > > > > >
>> > > > > > > Because Mesos DNS uses the 'name' field, if set, with some
>> > > > > > > string manipulation, for the job
>> > > > > > > 'vagrant/test/http_example_docker' my prototype generates
>> > > > > > > these DNS records:
>> > > > > > >
>> > > > > > > A record: vagranttesthttp-exampled.twitterscheduler.mesos
>> > > > > > > SRV record: _vagranttesthttp-exampled._tcp.twitterscheduler.mesos.
>> > > > > > >
>> > > > > > > If we want to make the current prototype useful for Mesos DNS,
>> > > > > > > I suggest we change the name field to the job name, which would
>> > > > > > > generate records like:
>> > > > > > > A: http_example_docker.twitterscheduler.mesos
>> > > > > > > SRV: _http_example_docker._tcp.twitterscheduler.slave.mesos
>> > > > > > >
>> > > > > > > I'll update my patch after getting some signal from you. Thanks.
>> > > > > > >
>> > > > > > > On Fri, Mar 25, 2016 at 1:49 PM, Zhitao Li
>> > > > > > > <zhitaoli.cs@gmail.com> wrote:
>> > > > > > >
>> > > > > > > > Hi Stephan,
>> > > > > > > >
>> > > > > > > > Thanks for looking at that prototype patch.
>> > > > > > > >
>> > > > > > > > I'll update the patch with the review comments, and probably
>> > > > > > > > add a global flag of "populate_discovery_info" to toggle this
>> > > > > > > > behavior.
>> > > > > > > >
>> > > > > > > > About the optional fields: I think it'll be hard to come up
>> > > > > > > > with a good set of rules applicable to all orgs using
>> > > > > > > > Aurora + Mesos, because the cluster management and service
>> > > > > > > > discovery stack could differ from org to org.
>> > > > > > > >
>> > > > > > > > In a recent Mesos work group, some experienced folks (Jie Yu
>> > > > > > > > and Ben Mahler) mentioned the idea of a *TaskInfoDecorator*,
>> > > > > > > > which is an optional and configurable plugin on the Aurora
>> > > > > > > > scheduler side that allows operators to set additional fields
>> > > > > > > > before sending the message to Mesos. I like this idea because
>> > > > > > > > it would enable Aurora users to experiment faster. Do you
>> > > > > > > > think this is an interesting idea worth pursuing?
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > On Fri, Mar 25, 2016 at 1:42 PM, Erb, Stephan
>> > > > > > > > <Stephan.Erb@blue-yonder.com> wrote:
>> > > > > > > >
>> > > > > > > >> I had a closer look at the Mesos documentation, and a design
>> > > > > > > >> document might be unnecessary. Most of the values are
>> > > > > > > >> optional. We can therefore leave them out until we have a
>> > > > > > > >> proper use case for them.
>> > > > > > > >>
>> > > > > > > >> I left a couple of comments in the review request.
>> > > > > > > >> ________________________________________
>> > > > > > > >> From: Zhitao Li
>> > > > > > > >> Sent: Tuesday, March 22, 2016 21:15
>> > > > > > > >> To: dev@aurora.apache.org
>> > > > > > > >> Subject: Re: Populate DiscoveryInfo in Mesos
>> > > > > > > >>
>> > > > > > > >> Hi Stephan,
>> > > > > > > >>
>> > > > > > > >> Sorry for the delay in following up on this. I took a quick
>> > > > > > > >> look at the Aurora code, and it's actually quite easy to pipe
>> > > > > > > >> this information to Mesos (see
>> > > > > > > >> https://reviews.apache.org/r/45177/ for a quick prototype).
>> > > > > > > >>
>> > > > > > > >> I'll take a stab at seeing how I can get Mesos-DNS to work
>> > > > > > > >> with this prototype.
>> > > > > > > >>
>> > > > > > > >> IMO, if this is something the community is interested in, the
>> > > > > > > >> main questions would be 1) how various fields would be mapped
>> > > > > > > >> in different Aurora usages, and 2) at which level opt-in/opt-out
>> > > > > > > >> should be configured for populating such information.
>> > > > > > > >>
>> > > > > > > >> I actually don't have much insight into how these usage
>> > > > > > > >> conventions would be set (through the scheduler command line
>> > > > > > > >> or the job configuration?)
>> > > > > > > >>
>> > > > > > > >> Do you think a design doc is the best action here, or a more
>> > > > > > > >> involved questionnaire about which fields would be useful for
>> > > > > > > >> the community, or what values they should take?
>> > > > > > > >>
>> > > > > > > >> On Mon, Mar 7, 2016 at 1:00 AM, Erb, Stephan
>> > > > > > > >> <Stephan.Erb@blue-yonder.com> wrote:
>> > > > > > > >>
>> > > > > > > >> > That sounds like a good idea! Great.
>> > > > > > > >> >
>> > > > > > > >> > If you go ahead with this, please be so kind and start by
>> > > > > > > >> > posting a short design document here on the mailing list
>> > > > > > > >> > (similar to those here
>> > > > > > > >> > https://github.com/apache/aurora/blob/master/docs/design-documents.md ,
>> > > > > > > >> > but probably shorter).
>> > > > > > > >> >
>> > > > > > > >> > This will allow us to split the discussion of the design
>> > > > > > > >> > from discussing the actual implementation.
>> > > > > > > >> > I believe this is necessary, as the DiscoveryInfo protocol
>> > > > > > > >> > is quite flexible (
>> > > > > > > >> > http://mesos.apache.org/documentation/latest/app-framework-development-guide/
>> > > > > > > >> > ).
>> > > > > > > >> >
>> > > > > > > >> > Thanks,
>> > > > > > > >> > Stephan
>> > > > > > > >> >
>> > > > > > > >> >
>> > > > > > > >> > ________________________________________
>> > > > > > > >> > From: Zhitao Li
>> > > > > > > >> > Sent: Monday, March 7, 2016 00:05
>> > > > > > > >> > To: dev@aurora.apache.org
>> > > > > > > >> > Subject: Populate DiscoveryInfo in Mesos
>> > > > > > > >> >
>> > > > > > > >> > Hi,
>> > > > > > > >> >
>> > > > > > > >> > It seems like Aurora does not populate the "discovery" field
>> > > > > > > >> > in either TaskInfo or ExecutorInfo in mesos.proto
>> > > > > > > >> > <https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L438>.
>> > > > > > > >> >
>> > > > > > > >> > I'm considering adding this to support retrieving the port
>> > > > > > > >> > map in Mesos directly. This would enable us to discover this
>> > > > > > > >> > information directly from the Mesos side, and also enable us
>> > > > > > > >> > to build one universal service discovery solution for
>> > > > > > > >> > multiple frameworks including Aurora.
>> > > > > > > >> >
>> > > > > > > >> > If there is no objection, I'll create a JIRA ticket for this
>> > > > > > > >> > task.
>> > > > > > > >> >
>> > > > > > > >> > Thanks.
>> > > > > > > >> > --
>> > > > > > > >> > Cheers,
>> > > > > > > >> >
>> > > > > > > >> > Zhitao Li
>> > > > > > > >> >
>> > > > > > > >>
>> > > > > > > >> --
>> > > > > > > >> Cheers,
>> > > > > > > >>
>> > > > > > > >> Zhitao Li
>> > > > > > > >>
>> > > > > > > >
>> > > > > > > > --
>> > > > > > > > Cheers,
>> > > > > > > >
>> > > > > > > > Zhitao Li
>> > > > > > > >
>> > > > > > >
>> > > > > > > --
>> > > > > > > Cheers,
>> > > > > > >
>> > > > > > > Zhitao Li
>> > > > > > >
>> > > > > >
>> > > > >
>> > > > > --
>> > > > > Cheers,
>> > > > >
>> > > > > Zhitao Li
>> > > > >
>> > > >
>> > > > --
>> > > > Cheers,
>> > > >
>> > > > Zhitao Li
>> > > >
>> > >
>> >
>> > --
>> > Cheers,
>> >
>> > Zhitao Li
>> >
>> > --
>> > Zameer Manji
>> >
>
> --
> Cheers,
>
> Zhitao Li
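As a rough illustration of the original proposal in this thread (populating the "discovery" field so that the port map can be read from Mesos), the sketch below builds a DiscoveryInfo-shaped dictionary from a job key and a port map. The field names (`visibility`, `name`, `environment`, `ports`, and `number`/`name`/`protocol` per port) are believed to follow the DiscoveryInfo message in mesos.proto and should be checked against the file linked in the first message. The helper name, the `CLUSTER` visibility, the `tcp` protocol, and the sample ports are assumptions; this is not what the patch at https://reviews.apache.org/r/45177/ does.

```python
from typing import Dict, List


def build_discovery_info(job_key: str, port_map: Dict[str, int]) -> dict:
    """Return a plain dict shaped like a Mesos DiscoveryInfo message.

    Hypothetical helper: not Aurora code, and no Mesos bindings are used.
    """
    role, env, job = job_key.split("/")
    ports: List[dict] = [
        # Protocol is assumed; the thread does not say which one is used.
        {"number": number, "name": name, "protocol": "tcp"}
        for name, number in sorted(port_map.items())
    ]
    return {
        "visibility": "CLUSTER",  # assumed value
        "name": ".".join((role, env, job)),  # role.env.job, as in the thread
        "environment": env,
        "ports": {"ports": ports},
    }


if __name__ == "__main__":
    # Sample job key from the thread; port names and numbers are made up.
    info = build_discovery_info(
        "vagrant/test/http-exampled", {"http": 31000, "admin": 31001}
    )
    print(info["name"])
    for port in info["ports"]["ports"]:
        print(port["name"], port["number"])
```

Since most DiscoveryInfo values are optional, as noted earlier in the thread, a real implementation could start with only the name and ports and add other fields once a use case appears.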