Mailing-List: contact commits-help@aurora.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@aurora.apache.org
Delivered-To: mailing list commits@aurora.apache.org
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: svn commit: r1748470 [15/19] - in /aurora/site: data/ publish/ publish/blog/ publish/blog/aurora-0-14-0-released/ publish/documentation/0.10.0/ publish/documentation/0.10.0/build-system/ publish/documentation/0.10.0/client-cluster-configuration/ publis...
Date: Tue, 14 Jun 2016 21:35:30 -0000
To: commits@aurora.apache.org
From: serb@apache.org
X-Mailer: svnmailer-1.0.9
Message-Id: <20160614213533.A6B483A0096@svn01-us-west.apache.org>
archived-at: Tue, 14 Jun 2016 21:35:55 -0000

Added: aurora/site/source/blog/2016-06-14-aurora-0-14-0-released.md
URL: http://svn.apache.org/viewvc/aurora/site/source/blog/2016-06-14-aurora-0-14-0-released.md?rev=1748470&view=auto
==============================================================================
--- aurora/site/source/blog/2016-06-14-aurora-0-14-0-released.md (added)
+++ aurora/site/source/blog/2016-06-14-aurora-0-14-0-released.md Tue Jun 14 21:35:25 2016
@@ -0,0 +1,89 @@
---
layout: post
title: 0.14.0 Released
permalink: /blog/aurora-0-14-0-released/
published: true
post_author:
  display_name: Stephan Erb
  twitter: ErbStephan
tags: Release
---

The latest Apache Aurora release, 0.14.0, is now available for
[download](http://aurora.apache.org/downloads/). Here are some highlights in this release:

 - Upgraded Mesos to 0.27.2.
 - Added a new optional [Apache Curator](https://curator.apache.org/) backend for performing
   scheduler leader election. You can enable this with the new `-zk_use_curator` scheduler argument.
 - Added the `--nosetuid-health-checks` flag to control whether the executor runs health checks as
   the job's role's user.
 - New scheduler command line argument `-offer_filter_duration` to control the time after which we
   expect Mesos to re-offer unused resources. A short duration improves scheduling performance in
   smaller clusters, but might lead to resource starvation for other frameworks if you run multiple
   frameworks in your cluster. Uses the Mesos default of 5s.
 - New scheduler command line option `-framework_name` to change the name used for registering
   the Aurora framework with Mesos. The current default value is 'TwitterScheduler'.
 - Added experimental support for launching tasks using filesystem images and the Mesos [unified
   containerizer](https://github.com/apache/mesos/blob/master/docs/container-image.md). See the
   linked documentation for details on configuring Mesos to use the unified containerizer. Note that
   earlier versions of Mesos do not fully support the unified containerizer; Mesos 0.28.x or later is
   recommended for anyone adopting task images via the Mesos containerizer.
 - Upgraded to pystachio 0.8.1 to pick up support for the new [Choice type](https://github.com/wickman/pystachio/blob/v0.8.1/README.md#choices).
 - The `container` property of a `Job` is now a Choice of either a `Container` holder, or a direct
   reference to either a `Docker` or `Mesos` container.
 - New scheduler command line argument `-ip` to control which IP address to bind the scheduler's
   HTTP server to.
 - Added experimental support for the Mesos GPU resource. This feature will be available in Mesos 1.0
   and is disabled by default. Use the `-allow_gpu_resource` flag to enable it. Once this feature is
   enabled, creating jobs with a GPU resource will make the scheduler snapshot backwards incompatible.
   For further details, please see the full release notes.
 - Experimental support for a webhook feature which POSTs all task state changes to a user-defined
   endpoint.
 - Added support for specifying the default tier name in the tier configuration file (`tiers.json`). The
   `default` property is required and is initialized with the `preemptible` tier (`preemptible` tier
   tasks can be preempted but their resources cannot be revoked).
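For illustration, here is a minimal sketch of the new Choice-typed `container` property,
referencing a `Mesos` container directly instead of wrapping it in a `Container` holder. The job
excerpt below is hypothetical, and `hello_task` is assumed to be a `Task` defined elsewhere in
the config:

    jobs = [
      Service(
        cluster = 'devcluster',
        environment = 'prod',
        role = 'www-data',
        name = 'hello_image',
        # Assumes a Task object named hello_task is defined elsewhere in this file.
        task = hello_task,
        # Direct reference - no Container(...) wrapper required with pystachio 0.8.1.
        container = Mesos(image = DockerImage(name = 'python', tag = '2.7')),
      )
    ]
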
Deprecations and removals:

 - Deprecated the `--restart-threshold` option in the `aurora job restart` command to match the job
   updater behavior. This option has no effect now and will be removed in a future release.
 - Deprecated the `-framework_name` default argument 'TwitterScheduler'. In a future release this
   will change to 'aurora'. Please be aware that depending on your usage of Mesos, this will
   be a backward incompatible change. For details, see MESOS-703.
 - The `-thermos_observer_root` command line arg has been removed from the scheduler. This was a
   relic from the time when executor checkpoints were written globally, rather than into a task's
   sandbox.
 - Setting the `container` property of a `Job` to a `Container` holder is deprecated in favor of
   setting it directly to the appropriate (i.e. `Docker` or `Mesos`) container type.
 - Deprecated the `numCpus`, `ramMb` and `diskMb` fields in the `TaskConfig` and `ResourceAggregate`
   thrift structs. Use the `resources` set to specify task resources or quota values.
 - The endpoint `/slaves` is deprecated. Please use `/agents` instead.
 - Deprecated the `production` field in the `TaskConfig` thrift struct. Use the `tier` field to specify
   task scheduling and resource handling behavior.
 - The scheduler `resources_*_ram_gb` and `resources_*_disk_gb` metrics have been renamed to
   `resources_*_ram_mb` and `resources_*_disk_mb` respectively. Note the unit change: GB -> MB.

Full release notes are available in the release
[CHANGELOG](https://git-wip-us.apache.org/repos/asf?p=aurora.git&f=CHANGELOG&hb=rel/0.14.0).

## Getting Involved

We encourage you to try out this release and let us know what you think. If you run into any issues,
please let us know on the [user mailing list and IRC](https://aurora.apache.org/community/). The
community also holds weekly IRC meetings at 11AM Pacific every Monday that you are welcome to join.

## Thanks

Thanks to the 11 contributors who made Apache Aurora 0.14.0 possible:

* Chris Bannister
* Dmitriy Shirchenko
* John Sirois
* Joshua Cohen
* Maxim Khutornenko
* Mehrdad Nurolahzade
* Raymond Khalife
* Renan DelValle
* Stephan Erb
* Zameer Manji
* se choi

Added: aurora/site/source/documentation/0.14.0/additional-resources/presentations.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.14.0/additional-resources/presentations.md?rev=1748470&view=auto
==============================================================================
--- aurora/site/source/documentation/0.14.0/additional-resources/presentations.md (added)
+++ aurora/site/source/documentation/0.14.0/additional-resources/presentations.md Tue Jun 14 21:35:25 2016
@@ -0,0 +1,80 @@
# Apache Aurora Presentations

Video and slides from presentations and panel discussions about Apache Aurora.

_(Listed in date descending order)_
| Presentation | Presented by | Date & Venue |
|---|---|---|
| Mesos & Aurora on a Small Scale (Video) | Florian Pfeiffer | October 8, 2015 at #MesosCon Europe 2015 |
| SLA Aware Maintenance for Operators (Video) | Joe Smith | October 8, 2015 at #MesosCon Europe 2015 |
| Shipping Code with Aurora (Video) | Bill Farner | August 20, 2015 at #MesosCon 2015 |
| Twitter’s Production Scale: Mesos and Aurora Operations (Video) | Joe Smith | August 20, 2015 at #MesosCon 2015 |
| From Monolith to Microservices w/ Aurora (Video) | Thanos Baskous, Tony Dong, Dobromir Montauk | April 30, 2015 at Bay Area Apache Aurora Users Group |
| Aurora + Mesos in Practice at Twitter (Video) | Bill Farner | March 07, 2015 at Bigcommerce TechTalk |
| Apache Auroraの始めかた (Getting Started with Apache Aurora) (Slides) | Masahito Zembutsu | February 28, 2015 at Open Source Conference 2015 Tokyo Spring |
| Apache Aurora Adopters Panel (Video) | Panelists: Ben Staffin, Josh Adams, Bill Farner, Berk Demir | February 19, 2015 at Bay Area Mesos Users Group |
| Operating Apache Aurora and Mesos at Twitter (Video) | Joe Smith | February 19, 2015 at Bay Area Mesos Users Group |
| Apache Aurora and Mesos at TellApart (Video) | Steve Niemitz | February 19, 2015 at Bay Area Mesos Users Group |
| Past, Present, and Future of the Aurora Scheduler (Video) | Bill Farner | August 21, 2014 at #MesosCon 2014 |
| Introduction to Apache Aurora (Video) | Bill Farner | March 25, 2014 at Aurora and Mesos Frameworks Meetup |
Added: aurora/site/source/documentation/0.14.0/additional-resources/tools.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.14.0/additional-resources/tools.md?rev=1748470&view=auto
==============================================================================
--- aurora/site/source/documentation/0.14.0/additional-resources/tools.md (added)
+++ aurora/site/source/documentation/0.14.0/additional-resources/tools.md Tue Jun 14 21:35:25 2016
@@ -0,0 +1,21 @@
# Tools

Various tools integrate with Aurora. Is there a tool missing? Let us know, or submit a patch to add it!

* Load-balancing technology used to direct traffic to services running on Aurora:
  - [synapse](https://github.com/airbnb/synapse) based on HAProxy
  - [aurproxy](https://github.com/tellapart/aurproxy) based on nginx
  - [jobhopper](https://github.com/benley/aurora-jobhopper) performs HTTP redirects for easy developer and administrator access

* RPC libraries that integrate with Aurora's [service discovery mechanism](../../features/service-discovery/):
  - [linkerd](https://linkerd.io/) RPC proxy
  - [finagle](https://twitter.github.io/finagle) (Scala)
  - [scales](https://github.com/steveniemitz/scales) (Python)

* Monitoring:
  - [collectd-aurora](https://github.com/zircote/collectd-aurora) for cluster monitoring using collectd
  - [Prometheus Aurora exporter](https://github.com/tommyulfsparre/aurora_exporter) for cluster monitoring using Prometheus
  - [Prometheus service discovery integration](http://prometheus.io/docs/operating/configuration/#zookeeper-serverset-sd-configurations-serverset_sd_config) for discovering and monitoring services running on Aurora

* Packaging and deployment:
  - [aurora-packaging](https://github.com/apache/aurora-packaging), the source of the official Aurora packages

Added: aurora/site/source/documentation/0.14.0/contributing.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.14.0/contributing.md?rev=1748470&view=auto
==============================================================================
--- aurora/site/source/documentation/0.14.0/contributing.md (added)
+++ aurora/site/source/documentation/0.14.0/contributing.md Tue Jun 14 21:35:25 2016
@@ -0,0 +1,91 @@
## Get the Source Code

First things first, you'll need the source! The Aurora source is available from Apache git:

    git clone https://git-wip-us.apache.org/repos/asf/aurora

Read the Style Guides
---------------------
Aurora's codebase is primarily Java and Python and conforms to the Twitter Commons style guides for
both languages.

- [Java Style Guide](https://github.com/twitter/commons/blob/master/src/java/com/twitter/common/styleguide.md)
- [Python Style Guide](https://github.com/twitter/commons/blob/master/src/python/twitter/common/styleguide.md)

## Find Something to Do

There are issues in [Jira](https://issues.apache.org/jira/browse/AURORA) with the
["newbie" label](https://issues.apache.org/jira/issues/?jql=project%20%3D%20AURORA%20AND%20labels%20%3D%20newbie%20and%20resolution%3Dunresolved)
that are good starting places for new Aurora contributors; pick one of these and dive in! Once
you've got a patch, the next step is to post a review.

## Getting your ReviewBoard Account

Go to https://reviews.apache.org and create an account.

## Setting up your ReviewBoard Environment

Run `./rbt status`. The first time this runs it will bootstrap and you will be asked to log in.
Subsequent runs will cache your login credentials.
## Submitting a Patch for Review

Post a review with `rbt`, fill out the fields in your browser, and hit Publish.

    ./rbt post -o

If you're unsure about who to add as a reviewer, you can default to adding Bill Farner (wfarner) and
Joshua Cohen (jcohen). They will take care of finding an appropriate reviewer for the patch.

Once you've done this, you probably want to mark the associated Jira issue as Reviewable.

## Updating an Existing Review

Incorporate review feedback, make some more commits, update your existing review, fill out the
fields in your browser, and hit Publish.

    ./rbt post -o -r <review-id>

## Getting Your Review Merged

If you're not an Aurora committer, one of the committers will merge your change in as described
below. Generally, the last reviewer to give the review a 'Ship It!' will be responsible.

### Merging Your Own Review (Committers)

Once you have shipits from the right committers, merge your changes in a single commit and mark
the review as submitted. The typical workflow is:

    git checkout master
    git pull origin master
    ./rbt patch -c <review-id>  # Verify the automatically-generated commit message looks sane,
                                # editing if necessary.
    git show master             # Verify everything looks sane
    git push origin master
    ./rbt close <review-id>

Note that even if you're developing using feature branches you will not use `git merge` - each
commit will be an atomic change accompanied by a ReviewBoard entry.

### Merging Someone Else's Review

Sometimes you'll need to merge someone else's RB. The typical workflow for this is:

    git checkout master
    git pull origin master
    ./rbt patch -c <review-id>
    git show master  # Verify everything looks sane, author is correct
    git push origin master

Note for committers: while we generally use the commit message generated by `./rbt patch`, some
changes are often required:

1. Ensure that the commit message does not exceed 100 characters per line.
2. Remove the "Testing Done" section. It's generally redundant (it can be seen by checking the linked
   review) or entirely irrelevant to the commit itself.

## Cleaning Up

Your patch has landed, congratulations! The last thing you'll want to do before moving on to your
next fix is to clean up your Jira and ReviewBoard. The former should be marked as
"Resolved" while the latter should be marked as "Submitted".

Added: aurora/site/source/documentation/0.14.0/development/client.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.14.0/development/client.md?rev=1748470&view=auto
==============================================================================
--- aurora/site/source/documentation/0.14.0/development/client.md (added)
+++ aurora/site/source/documentation/0.14.0/development/client.md Tue Jun 14 21:35:25 2016
@@ -0,0 +1,81 @@
Developing the Aurora Client
============================

The client is written in Python, and uses the
[Pants](http://pantsbuild.github.io/python-readme.html) build tool.


Building and Testing
--------------------

Building and testing the client code are both done using Pants. The relevant targets to know about
are:

 * Build a client executable: `./pants binary src/main/python/apache/aurora/client:aurora`
 * Test client code: `./pants test src/test/python/apache/aurora/client/cli:cli`

If you want to build a source distribution of the client, run `./build-support/release/make-python-sdists`.
Running/Debugging
-----------------

For manually testing client changes against a cluster, we use [Vagrant](https://www.vagrantup.com/).
To start a virtual cluster, install Vagrant and then run `vagrant up` from the root of
the Aurora workspace. This will create a vagrant host named "devcluster", with a Mesos master, a set
of Mesos agents, and an Aurora scheduler.

If you have a change you would like to test in your local cluster, rebuild the client:

    vagrant ssh -c 'aurorabuild client'

Once this completes, the `aurora` command will reflect your changes.


Running/Debugging in PyCharm
----------------------------

It's possible to use PyCharm to run and debug both the client and client tests in an IDE. In order
to do this, first run:

    build-support/python/make-pycharm-virtualenv

This script will configure a virtualenv with all of our Python requirements. Once the script
completes it will emit instructions for configuring PyCharm:

    Your PyCharm environment is now set up. You can open the project root
    directory with PyCharm.

    Once the project is loaded:
      - open project settings
      - click 'Project Interpreter'
      - click the cog in the upper-right corner
      - click 'Add Local'
      - select 'build-support/python/pycharm.venv/bin/python'
      - click 'OK'

### Running/Debugging Tests

After following these instructions, you should now be able to run/debug tests directly from the IDE
by right-clicking on a test (or test class) and choosing to run or debug:

[![Debug Client Test](../images/debug-client-test.png)](../images/debug-client-test.png)

If you've set a breakpoint, you can see the run will now stop and let you debug:

[![Debugging Client Test](../images/debugging-client-test.png)](../images/debugging-client-test.png)

### Running/Debugging the Client

Actually running and debugging the client is unfortunately a bit more complex. You'll need to create
a Run configuration:

* Go to Run → Edit Configurations
* Click the + icon to add a new configuration.
* Choose python and name the configuration 'client'.
* Set the script path to `/your/path/to/aurora/src/main/python/apache/aurora/client/cli/client.py`
* Set the script parameters to the command you want to run (e.g. `job status <job key>`)
* Expand the Environment section and click the ellipsis to add a new environment variable
* Click the + at the bottom to add a new variable named AURORA_CONFIG_ROOT whose value is the
  path where your cluster configuration can be found. For example, to talk to the scheduler
  running in the vagrant image, it would be set to `/your/path/to/aurora/examples/vagrant` (this
  is the directory where our example clusters.json is found).
* You should now be able to run and debug this configuration!

Added: aurora/site/source/documentation/0.14.0/development/committers-guide.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.14.0/development/committers-guide.md?rev=1748470&view=auto
==============================================================================
--- aurora/site/source/documentation/0.14.0/development/committers-guide.md (added)
+++ aurora/site/source/documentation/0.14.0/development/committers-guide.md Tue Jun 14 21:35:25 2016
@@ -0,0 +1,86 @@
Committer's Guide
=================

Information for official Apache Aurora committers.
Setting up your email account
-----------------------------
Once your Apache ID has been set up, you can configure your account, add ssh keys, and set up an
email forwarding address at

    http://id.apache.org

Additional instructions for setting up your new committer email can be found at

    http://www.apache.org/dev/user-email.html

The recommended setup is to configure all services (mailing lists, JIRA, ReviewBoard) to send
emails to your @apache.org email address.


Creating a gpg key for releases
-------------------------------
In order to create a release candidate you will need a gpg key published to an external key server,
and that key will need to be added to our KEYS file as well.

1. Create a key:

       gpg --gen-key

2. Add your gpg key to the Apache Aurora KEYS file:

       git clone https://git-wip-us.apache.org/repos/asf/aurora.git
       (gpg --list-sigs <KEY ID> && gpg --armor --export <KEY ID>) >> KEYS
       git add KEYS && git commit -m "Adding gpg key for <APACHE ID>"
       ./rbt post -o -g

3. Publish the key to an external key server:

       gpg --keyserver pgp.mit.edu --send-keys <KEY ID>

4. Propagate the KEYS file changes to the Apache Aurora svn dist locations listed below:

       https://dist.apache.org/repos/dist/dev/aurora/KEYS
       https://dist.apache.org/repos/dist/release/aurora/KEYS

5. Add your key to git config for use with the release scripts:

       git config --global user.signingkey <KEY ID>


Creating a release
------------------
The following will guide you through the steps to create a release candidate, vote, and finally an
official Apache Aurora release. Before starting, your gpg key should be in the KEYS file and you
must have access to commit to the dist.a.o repositories.

1. Ensure that all issues resolved for this release candidate are tagged with the correct Fix
   Version in JIRA; the changelog script will use this to generate the CHANGELOG in step #2.

2. Create a release candidate. This will automatically update the CHANGELOG and commit it, create a
   branch and update the current version within the trunk. To create a minor version update and
   publish it run

       ./build-support/release/release-candidate -l m -p

3. Update, if necessary, the draft email created from the `release-candidate` script in step #2 and
   send the [VOTE] email to the dev@ mailing list. You can verify the release signature and checksums
   by running

       ./build-support/release/verify-release-candidate

4. Wait for the vote to complete. If the vote fails, close the vote by replying to the initial [VOTE]
   email sent in step #3, editing the subject to [RESULT][VOTE] ... and noting the failure reason
   (example [here](http://markmail.org/message/d4d6xtvj7vgwi76f)). Now address any issues, go back to
   step #1, and run again; this time use the -r flag to increment the release candidate version.
   This will automatically clean up the release candidate rc0 branch and source distribution.

       ./build-support/release/release-candidate -l m -r 1 -p

5. Once the vote has successfully passed, create the release:

       ./build-support/release/release

6. Update the draft email created from the `release` script in step #5 to include the Apache IDs of
   all binding votes and send the [RESULT][VOTE] email to the dev@ mailing list.
Added: aurora/site/source/documentation/0.14.0/development/db-migration.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.14.0/development/db-migration.md?rev=1748470&view=auto
==============================================================================
--- aurora/site/source/documentation/0.14.0/development/db-migration.md (added)
+++ aurora/site/source/documentation/0.14.0/development/db-migration.md Tue Jun 14 21:35:25 2016
@@ -0,0 +1,34 @@
DB Migrations
=============

Changes to the DB schema should be made in the form of migrations. This ensures that all changes
are applied correctly after a DB dump from a previous version is restored.

DB migrations are managed through a system built on top of
[MyBatis Migrations](http://www.mybatis.org/migrations/). The migrations are run automatically when
a snapshot is restored; no manual interaction is required by cluster operators.

Upgrades
--------
When adding or altering tables or changing data, in addition to making the change in
[schema.sql](../../src/main/resources/org/apache/aurora/scheduler/storage/db/schema.sql), a new
migration class should be created under the org.apache.aurora.scheduler.storage.db.migration
package. The class should implement the [MigrationScript](https://github.com/mybatis/migrations/blob/master/src/main/java/org/apache/ibatis/migration/MigrationScript.java)
interface (see [V001_TestMigration](https://github.com/apache/aurora/blob/rel/0.14.0/src/test/java/org/apache/aurora/scheduler/storage/db/testmigration/V001_TestMigration.java)
as an example). The upgrade and downgrade scripts are defined in this class. When restoring a
snapshot, the list of migrations on the classpath is compared to the list of applied changes in the
DB. Any changes that have not yet been applied are executed and their downgrade script is stored
alongside the changelog entry in the database to facilitate downgrades in the event of a rollback.

Downgrades
----------
If, while running migrations, a rollback is detected, i.e. a change exists in the DB changelog that
does not exist on the classpath, the downgrade script associated with each affected change is
applied.

Baselines
---------
After enough time has passed (at least 1 official release), it should be safe to baseline migrations
if desired. This can be accomplished by ensuring the changes from migrations have been applied to
[schema.sql](../../src/main/resources/org/apache/aurora/scheduler/storage/db/schema.sql) and then
removing the corresponding migration classes and adding a migration to remove the changelog entries.
\ No newline at end of file

Added: aurora/site/source/documentation/0.14.0/development/design-documents.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.14.0/development/design-documents.md?rev=1748470&view=auto
==============================================================================
--- aurora/site/source/documentation/0.14.0/development/design-documents.md (added)
+++ aurora/site/source/documentation/0.14.0/development/design-documents.md Tue Jun 14 21:35:25 2016
@@ -0,0 +1,21 @@
Design Documents
================

Since its inception as an Apache project, larger feature additions to the
Aurora code base have been discussed in the form of design documents. Design documents
are living documents until a consensus has been reached to implement a feature
in the proposed form.
Current and past documents:

* [Command Hooks for the Aurora Client](../design/command-hooks/)
* [GPU Resources in Aurora](https://docs.google.com/document/d/1J9SIswRMpVKQpnlvJAMAJtKfPP7ZARFknuyXl-2aZ-M/edit)
* [Health Checks for Updates](https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit)
* [JobUpdateDiff thrift API](https://docs.google.com/document/d/1Fc_YhhV7fc4D9Xv6gJzpfooxbK4YWZcvzw6Bd3qVTL8/edit)
* [REST API RFC](https://docs.google.com/document/d/11_lAsYIRlD5ETRzF2eSd3oa8LXAHYFD8rSetspYXaf4/edit)
* [Revocable Mesos offers in Aurora](https://docs.google.com/document/d/1r1WCHgmPJp5wbrqSZLsgtxPNj3sULfHrSFmxp2GyPTo/edit)
* [Supporting the Mesos Universal Containerizer](https://docs.google.com/document/d/111T09NBF2zjjl7HE95xglsDpRdKoZqhCRM5hHmOfTLA/edit?usp=sharing)
* [Tier Management In Apache Aurora](https://docs.google.com/document/d/1erszT-HsWf1zCIfhbqHlsotHxWUvDyI2xUwNQQQxLgs/edit?usp=sharing)
* [Ubiquitous Jobs](https://docs.google.com/document/d/12hr6GnUZU3mc7xsWRzMi3nQILGB-3vyUxvbG-6YmvdE/edit)

Design documents can be found in the Aurora issue tracker via the query [`project = AURORA AND text ~ "docs.google.com" ORDER BY created`](https://issues.apache.org/jira/browse/AURORA-1528?jql=project%20%3D%20AURORA%20AND%20text%20~%20%22docs.google.com%22%20ORDER%20BY%20created).

Added: aurora/site/source/documentation/0.14.0/development/design/command-hooks.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.14.0/development/design/command-hooks.md?rev=1748470&view=auto
==============================================================================
--- aurora/site/source/documentation/0.14.0/development/design/command-hooks.md (added)
+++ aurora/site/source/documentation/0.14.0/development/design/command-hooks.md Tue Jun 14 21:35:25 2016
@@ -0,0 +1,102 @@
# Command Hooks for the Aurora Client

## Introduction/Motivation

We've got hooks in the client that surround API calls. These are
pretty awkward, because they don't correlate with user actions. For
example, suppose we wanted a policy that said users weren't allowed to
kill all instances of a production job at once.

Right now, all that we could hook would be the "killJob" api call. But
kill (at least in newer versions of the client) normally runs in
batches. If a user called killall, what we would see on the API level
is a series of "killJob" calls, each of which specified a batch of
instances. We wouldn't be able to distinguish between really killing
all instances of a job (which is forbidden under this policy), and
carefully killing in batches (which is permitted). In each case, the
hook would just see a series of API calls, and couldn't find out what
the actual command being executed was!

For most policy enforcement, what we really want to be able to do is
look at and vet the commands that a user is performing, not the API
calls that the client uses to implement those commands.

So I propose that we add a new kind of hooks, which surround noun/verb
commands. A hook will register itself to handle a collection of (noun,
verb) pairs. Whenever any of those noun/verb commands are invoked, the
hook's methods will be called around the execution of the verb. A
pre-hook will have the ability to reject a command, preventing the
verb from being executed.

## Registering Hooks

These hooks will be registered via configuration plugins. A configuration plugin
can register hooks using an API.
Hooks registered this way are, effectively,
hardwired into the client executable.

The order of execution of hooks is unspecified: they may be called in
any order. There is no way to guarantee that one hook will execute
before some other hook.


### Global Hooks

Commands registered by the python call are called _global_ hooks,
because they will run for all configurations, whether or not they
specify any hooks in the configuration file.

In the implementation, hooks are registered in the module
`apache.aurora.client.cli.command_hooks`, using the class
`GlobalCommandHookRegistry`. A global hook can be registered by calling
`GlobalCommandHookRegistry.register_command_hook` in a configuration plugin.

### The API

    class CommandHook(object):
      @property
      def name(self):
        """Returns a name for the hook."""

      def get_nouns(self):
        """Return the nouns that have verbs that should invoke this hook."""

      def get_verbs(self, noun):
        """Return the verbs for a particular noun that should invoke this hook."""

      @abstractmethod
      def pre_command(self, noun, verb, context, commandline):
        """Execute a hook before invoking a verb.
        * noun: the noun being invoked.
        * verb: the verb being invoked.
        * context: the context object that will be used to invoke the verb.
          The options object will be initialized before calling the hook.
        * commandline: the original argv collection used to invoke the client.
        Returns: True if the command should be allowed to proceed; False if the command
        should be rejected.
        """

      def post_command(self, noun, verb, context, commandline, result):
        """Execute a hook after invoking a verb.
        * noun: the noun being invoked.
        * verb: the verb being invoked.
        * context: the context object that was used to invoke the verb.
        * commandline: the original argv collection used to invoke the client.
        * result: the result code returned by the verb.
        Returns: nothing
        """

    class GlobalCommandHookRegistry(object):
      @classmethod
      def register_command_hook(self, hook):
        pass

### Skipping Hooks

To skip a hook, a user uses a command-line option, `--skip-hooks`. The option can either
specify specific hooks to skip, or "all":

* `aurora --skip-hooks=all job create east/bozo/devel/myjob` will create a job
  without running any hooks.
* `aurora --skip-hooks=test,iq job create east/bozo/devel/myjob` will create a job,
  and will skip only the hooks named "test" and "iq".
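To make this concrete, here is a hedged sketch of a hook written against the proposed API; the
hook name and the policy it enforces (blocking `job killall` against prod job keys) are
hypothetical:

    class ProdKillallHook(CommandHook):
      @property
      def name(self):
        return 'prod_killall_guard'

      def get_nouns(self):
        # Only commands under the 'job' noun should invoke this hook.
        return ['job']

      def get_verbs(self, noun):
        # Within 'job', only the 'killall' verb is wrapped.
        return ['killall'] if noun == 'job' else []

      def pre_command(self, noun, verb, context, commandline):
        # Hypothetical policy: reject killall against any job key containing '/prod/'.
        if any('/prod/' in arg for arg in commandline):
          return False  # command is rejected; the verb will not execute
        return True

      def post_command(self, noun, verb, context, commandline, result):
        pass  # nothing to clean up for this hook

    # Registered from a configuration plugin:
    GlobalCommandHookRegistry.register_command_hook(ProdKillallHook())
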
Added: aurora/site/source/documentation/0.14.0/development/scheduler.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.14.0/development/scheduler.md?rev=1748470&view=auto
==============================================================================
--- aurora/site/source/documentation/0.14.0/development/scheduler.md (added)
+++ aurora/site/source/documentation/0.14.0/development/scheduler.md Tue Jun 14 21:35:25 2016
@@ -0,0 +1,118 @@
Developing the Aurora Scheduler
===============================

The Aurora scheduler is written in Java and built with [Gradle](http://gradle.org).


Prerequisite
============

When using Apache Aurora checked out from the source repository or the binary
distribution, the Gradle wrapper and JavaScript dependencies are provided.
However, you need to manually install them when using the source release
downloads:

1. Install Gradle following the instructions on the [Gradle web site](http://gradle.org)
2. From the root directory of the Apache Aurora project, generate the Gradle
   wrapper by running:

       gradle wrapper


Getting Started
===============

You will need Java 8 installed and on your `PATH`, or unzipped somewhere with `JAVA_HOME` set. Then

    ./gradlew tasks

will bootstrap the build system and show available tasks. This can take a while the first time you
run it, but subsequent runs will be much faster due to cached artifacts.

Running the Tests
-----------------
Aurora has a comprehensive unit test suite. To run the tests use

    ./gradlew build

Gradle will only re-run tests when their dependencies have changed. To force a re-run of all
tests use

    ./gradlew clean build

Running the build with code quality checks
------------------------------------------
To speed up development iteration, the plain Gradle commands will not run static analysis tools.
However, you should run these before posting a review diff, and **always** run this before pushing a
commit to origin/master.

    ./gradlew build -Pq

Running integration tests
-------------------------
To run the same tests that are run in the Apache Aurora continuous integration
environment:

    ./build-support/jenkins/build.sh

In addition, there is an end-to-end test that runs a suite of aurora commands
using a virtual cluster:

    ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh

Creating a bundle for deployment
--------------------------------
Gradle can create a zip file containing Aurora, all of its dependencies, and a launch script with

    ./gradlew distZip

or a tar file containing the same files with

    ./gradlew distTar

The output file will be written to `dist/distributions/aurora-scheduler.zip` or
`dist/distributions/aurora-scheduler.tar`.


Developing Aurora Java code
===========================

Setting up an IDE
-----------------
Gradle can generate project files for your IDE. To generate an IntelliJ IDEA project run

    ./gradlew idea

and import the generated `aurora.ipr` file.

Adding or Upgrading a Dependency
--------------------------------
New dependencies can be added from Maven central by adding a `compile` dependency to `build.gradle`.
For example, to add a dependency on `com.example`'s `example-lib` 1.0 add this block:

    compile 'com.example:example-lib:1.0'

NOTE: Anyone thinking about adding a new dependency should first familiarize themselves with the
Apache Foundation's third-party licensing
[policy](http://www.apache.org/legal/resolved.html#category-x).


Developing the Aurora Build System
==================================

Bootstrapping Gradle
--------------------
The following files were autogenerated by `gradle wrapper` using Gradle's
[Wrapper](http://www.gradle.org/docs/current/dsl/org.gradle.api.tasks.wrapper.Wrapper.html) plugin and
should not be modified directly:

    ./gradlew
    ./gradlew.bat
    ./gradle/wrapper/gradle-wrapper.jar
    ./gradle/wrapper/gradle-wrapper.properties

To upgrade Gradle, unpack the new version somewhere, run `/path/to/new/gradle wrapper` in the
repository root, and commit the changed files.
Added: aurora/site/source/documentation/0.14.0/development/thermos.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.14.0/development/thermos.md?rev=1748470&view=auto
==============================================================================
--- aurora/site/source/documentation/0.14.0/development/thermos.md (added)
+++ aurora/site/source/documentation/0.14.0/development/thermos.md Tue Jun 14 21:35:25 2016
@@ -0,0 +1,126 @@
The Python components of Aurora are built using [Pants](https://pantsbuild.github.io).


Python Build Conventions
========================
The Python code is laid out according to the following conventions:

1. 1 `BUILD` per 3rd level directory. For a list of current top-level packages run:

       % find src/main/python -maxdepth 3 -mindepth 3 -type d |\
         while read dname; do echo $dname |\
         sed 's@src/main/python/\(.*\)/\(.*\)/\(.*\).*@\1.\2.\3@'; done

2. Each `BUILD` file exports 1
   [`python_library`](https://pantsbuild.github.io/build_dictionary.html#bdict_python_library)
   that provides a
   [`setup_py`](https://pantsbuild.github.io/build_dictionary.html#setup_py)
   containing each
   [`python_binary`](https://pantsbuild.github.io/build_dictionary.html#python_binary)
   in the `BUILD` file, named the same as the directory it's in so that it can be referenced
   without a ':' character. The `sources` field in the `python_library` will almost always be
   `rglobs('*.py')`.

3. Other BUILD files may only depend on this single public `python_library`
   target. Any other target is considered a private implementation detail and
   should be prefixed with an `_`.

4. `python_binary` targets are always named the same as the exported console script.

5. `python_binary` targets must have identical `dependencies` to the `python_library` exported
   by the package and must use `entry_point`.

   This means a PEX file generated by pants will contain exactly the same files that will be
   available on the `PYTHONPATH` in the case of `pip install` of the corresponding library
   target. This will help our migration off of Pants in the future.

Annotated example - apache.thermos.runner
-----------------------------------------

    % find src/main/python/apache/thermos/runner
    src/main/python/apache/thermos/runner
    src/main/python/apache/thermos/runner/__init__.py
    src/main/python/apache/thermos/runner/thermos_runner.py
    src/main/python/apache/thermos/runner/BUILD
    % cat src/main/python/apache/thermos/runner/BUILD
    # License boilerplate omitted
    import os


    # Private target so that a setup_py can exist without a circular dependency. Only targets within
    # this file should depend on this.
    python_library(
      name = '_runner',
      # The target covers every python file under this directory and subdirectories.
      sources = rglobs('*.py'),
      dependencies = [
        '3rdparty/python:twitter.common.app',
        '3rdparty/python:twitter.common.log',
        # Source dependencies are always referenced without a ':'.
        'src/main/python/apache/thermos/common',
        'src/main/python/apache/thermos/config',
        'src/main/python/apache/thermos/core',
      ],
    )

    # Binary target for thermos_runner.pex. Nothing should depend on this - it's only used as an
    # argument to ./pants binary.
    python_binary(
      name = 'thermos_runner',
      # Use entry_point, not source so the files used here are the same ones tests see.
      entry_point = 'apache.thermos.bin.thermos_runner',
      dependencies = [
        # Notice that we depend only on the single private target from this BUILD file here.
        ':_runner',
      ],
    )

    # The public library that everyone importing the runner symbols uses.
    # The test targets and any other dependent source code should depend on this.
    python_library(
      name = 'runner',
      dependencies = [
        # Again, notice that we depend only on the single private target from this BUILD file here.
        ':_runner',
      ],
      # We always provide a setup_py. This will cause any dependee libraries to automatically
      # reference this library in their requirements.txt rather than copy the source files into their
      # sdist.
      provides = setup_py(
        # Conventionally named and versioned.
        name = 'apache.thermos.runner',
        version = open(os.path.join(get_buildroot(), '.auroraversion')).read().strip().upper(),
      ).with_binaries({
        # Every binary in this file should also be repeated here.
        # Always use the dict-form of .with_binaries so that commands with dashes in their names are
        # supported.
        # The console script name is always the same as the PEX with .pex stripped.
        'thermos_runner': ':thermos_runner',
      }),
    )


Thermos Test resources
======================

The Aurora source repository and distributions contain several
[binary files](../../src/test/resources/org/apache/thermos/root/checkpoints) to
qualify the backwards-compatibility of thermos with checkpoint data. Since
thermos persists state to disk (to be read by the thermos observer), it is important that we have
tests that prevent regressions affecting the ability to parse previously-written data.

The files included represent persisted checkpoints that exercise different
features of thermos. The existing files should not be modified unless
we are accepting backwards incompatibility, such as with a major release.

It is not practical to write source code to generate these files on the fly,
as source would be vulnerable to drift (e.g. due to refactoring) in ways
that would undermine the goal of ensuring backwards compatibility.

The most common reason to add a new checkpoint file would be to provide
coverage for new thermos features that alter the data format. This is
accomplished by writing and running a
[job configuration](../../reference/configuration/) that exercises the feature, and
copying the checkpoint file from the sandbox directory; by default this is
`/var/run/thermos/checkpoints/<task_id>`.

Added: aurora/site/source/documentation/0.14.0/development/thrift.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.14.0/development/thrift.md?rev=1748470&view=auto
==============================================================================
--- aurora/site/source/documentation/0.14.0/development/thrift.md (added)
+++ aurora/site/source/documentation/0.14.0/development/thrift.md Tue Jun 14 21:35:25 2016
@@ -0,0 +1,54 @@
Thrift
======

Aurora uses [Apache Thrift](https://thrift.apache.org/) for representing structured data in
the client/server RPC protocol as well as for internal data storage. While Thrift is capable of
correctly handling additions and renames of existing members, field removals must be done
carefully to ensure backwards compatibility and provide a predictable deprecation cycle. This
document describes general guidelines for making Thrift schema changes to the existing fields in
[api.thrift](https://github.com/apache/aurora/blob/rel/0.14.0/api/src/main/thrift/org/apache/aurora/gen/api.thrift).
It is highly recommended to go through the
[Thrift: The Missing Guide](http://diwakergupta.github.io/thrift-missing-guide/) first to refresh on
basic Thrift schema concepts.

Checklist
---------
Every existing Thrift schema modification is unique in its requirements and must be analyzed
carefully to identify its scope and expected consequences. The following checklist may help in that
analysis:

* Is this a new field/struct? If yes, go ahead.
* Is this a pure field/struct rename without any type/structure change? If yes, go ahead and rename.
* Anything else? Read further to make sure your change is properly planned.

Deprecation cycle
-----------------
Any time a breaking change (e.g. field replacement or removal) is required, the following cycle
must be followed:

### vCurrent
The change is applied in a way that does not break the ability of a scheduler/client at this version
to communicate with a scheduler/client at vCurrent-1.

* Do not remove or rename the old field.
* Add a new field as an eventual replacement of the old one and implement a dual read/write
  anywhere the old field is used. If a thrift struct is mapped in the DB store, make sure both columns
  are marked as `NOT NULL`.
* Check [storage.thrift](https://github.com/apache/aurora/blob/rel/0.14.0/api/src/main/thrift/org/apache/aurora/gen/storage.thrift) to see if
  the affected struct is stored in Aurora scheduler storage. If so, it's almost certainly also
  necessary to perform a [DB migration](../db-migration/).
* Add a deprecation jira ticket into the vCurrent+1 release candidate.
* Add a TODO for the deprecated field mentioning the jira ticket.

### vCurrent+1
Finalize the change by removing the deprecated fields from the Thrift schema.

* Drop any dual read/write routines added in the previous version.
* Remove thrift backfilling in the scheduler.
* Remove the deprecated Thrift field.

Testing
-------
It's always advisable to test your changes in the local vagrant environment to build more
confidence that your change is backwards compatible. It's easy to simulate different
client/scheduler versions by playing with the `aurorabuild` command. See [this document](../../getting-started/vagrant/)
for more.

Added: aurora/site/source/documentation/0.14.0/development/ui.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.14.0/development/ui.md?rev=1748470&view=auto
==============================================================================
--- aurora/site/source/documentation/0.14.0/development/ui.md (added)
+++ aurora/site/source/documentation/0.14.0/development/ui.md Tue Jun 14 21:35:25 2016
@@ -0,0 +1,46 @@
Developing the Aurora Scheduler UI
==================================

Installing bower (optional)
---------------------------
Third-party JS libraries used in Aurora (located at 3rdparty/javascript/bower_components) are
managed by bower, a JS dependency manager. Bower is only required if you plan to add, remove or
update JS libraries. Bower can be installed using the following command:

    npm install -g bower

Bower depends on node.js and npm. The easiest way to install node on a mac is via brew:

    brew install node

For more node.js installation options refer to https://github.com/joyent/node/wiki/Installation.

More info on installing and using bower can be found at: http://bower.io/.
Once installed, you can
use the following commands to view and modify the bower repo at
3rdparty/javascript/bower_components:

    bower list
    bower install <library name>
    bower remove <library name>
    bower update <library name>
    bower help


Faster Iteration in Vagrant
---------------------------
The scheduler serves UI assets from the classpath. For production deployments this means the assets
are served from within a jar. However, for faster development iteration, the vagrant image is
configured to add the `scheduler` subtree of `/vagrant/dist/resources/main` to the head of
`CLASSPATH`. This path is configured as a shared filesystem to the path on the host system where
your Aurora repository lives. This means that any updates under `dist/resources/main/scheduler` in
your checkout will be reflected immediately in the UI served from within the vagrant image.

The one caveat to this is that this path is under `dist`, not `src`. This is because the assets must
be processed by gradle before they can be served. So, unfortunately, you cannot just save your local
changes and see them reflected in the UI; you must first run `./gradlew processResources`. This is
less than ideal, but better than having to restart the scheduler after every change. Additionally,
gradle makes this process somewhat easier with the use of the `--continuous` flag. If you run
`./gradlew processResources --continuous`, gradle will monitor the filesystem for changes and run the
task automatically as necessary. This doesn't quite provide hot-reload capabilities, but it does
allow for <5s from save to changes being visible in the UI with no further action required on the
part of the developer.

Added: aurora/site/source/documentation/0.14.0/features/constraints.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.14.0/features/constraints.md?rev=1748470&view=auto
==============================================================================
--- aurora/site/source/documentation/0.14.0/features/constraints.md (added)
+++ aurora/site/source/documentation/0.14.0/features/constraints.md Tue Jun 14 21:35:25 2016
@@ -0,0 +1,126 @@
Scheduling Constraints
======================

By default, Aurora will pick any random agent with sufficient resources
in order to schedule a task. This scheduling choice can be further
restricted with the help of constraints.


Mesos Attributes
----------------

Data centers are often organized with hierarchical failure domains. Common failure domains
include hosts, racks, rows, and PDUs. If you have this information available, it is wise to tag
the Mesos agent with it as
[attributes](https://mesos.apache.org/documentation/attributes-resources/).

The Mesos agent `--attributes` command line argument can be used to mark agents with
static key/value pairs, so-called attributes (not to be confused with `--resources`, which are
dynamic and accounted).

For example, consider the host `cluster1-aaa-03-sr2` and its following attributes (given in
key:value format): `host:cluster1-aaa-03-sr2` and `rack:aaa`.

Aurora makes these attributes available for matching with scheduling constraints.


Limit Constraints
-----------------

Limit constraints allow you to control machine diversity using constraints. The below
constraint ensures that no more than two instances of your job may run on a single host.
Think of this as a "group by" limit.

    Service(
      name = 'webservice',
      role = 'www-data',
      constraints = {
        'host': 'limit:2',
      }
      ...
    )


Likewise, you can use constraints to control rack diversity, e.g. at
most one task per rack:

    constraints = {
      'rack': 'limit:1',
    }

Use these constraints sparingly, as they can dramatically reduce Tasks' schedulability.
Further details are available in the reference documentation on
[Scheduling Constraints](../../reference/configuration/#specifying-scheduling-constraints).


Value Constraints
-----------------

Value constraints can be used to express that a certain attribute with a certain value
should be present on a Mesos agent. For example, the following job would only be
scheduled on nodes that claim to have an `SSD` as their disk.

    Service(
      name = 'webservice',
      role = 'www-data',
      constraints = {
        'disk': 'SSD',
      }
      ...
    )


Further details are available in the reference documentation on
[Scheduling Constraints](../../reference/configuration/#specifying-scheduling-constraints).


Running stateful services
-------------------------

Aurora is best suited to run stateless applications, but it also accommodates stateful services
like databases, or services that otherwise need to always run on the same machines.

### Dedicated attribute

Most Mesos attributes are arbitrary and available for custom use. There is one exception,
though: the `dedicated` attribute. Aurora treats this attribute specially: it will only schedule
matching jobs on machines carrying it, and will only allow matching jobs to run there.

#### Syntax
The dedicated attribute has semantic meaning. The format is `$role(/.*)?`. When a job is created,
the scheduler requires that the `$role` component matches the `role` field in the job
configuration, and will reject the job creation otherwise. The remainder of the attribute is
free-form. We've developed the idiom of formatting this attribute as `$role/$job`, but do not
enforce this. For example: a job `devcluster/www-data/prod/hello` with a dedicated constraint set as
`www-data/web.multi` will have its tasks scheduled only on Mesos agents configured with:
`--attributes=dedicated:www-data/web.multi`.

A wildcard (`*`) may be used for the role portion of the dedicated attribute, which will allow any
owner to elect for a job to run on the host(s). For example: tasks from both
`devcluster/www-data/prod/hello` and `devcluster/vagrant/test/hello` with a dedicated constraint
formatted as `*/web.multi` will be scheduled only on Mesos agents configured with
`--attributes=dedicated:*/web.multi`. This may be useful when assembling a virtual cluster of
machines sharing the same set of traits or requirements.

##### Example
Consider the following agent command line:

    mesos-slave --attributes="dedicated:db_team/redis" ...

And this job configuration:

    Service(
      name = 'redis',
      role = 'db_team',
      constraints = {
        'dedicated': 'db_team/redis'
      }
      ...
    )

The job configuration indicates that it should only be scheduled on agents with the attribute
`dedicated:db_team/redis`. Additionally, Aurora will prevent any tasks that do _not_ have that
constraint from running on those agents.
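To close, here is a hedged sketch that combines the limit and value constraints described above in
a single job; the role and job names are hypothetical, and it simply mirrors the examples earlier
in this document:

    Service(
      name = 'webservice',
      role = 'www-data',
      constraints = {
        'host': 'limit:2',   # at most two instances per host
        'rack': 'limit:1',   # at most one instance per rack
        'disk': 'SSD',       # only on agents tagged with disk:SSD
      }
      ...
    )
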
Added: aurora/site/source/documentation/0.14.0/features/containers.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.14.0/features/containers.md?rev=1748470&view=auto
==============================================================================
--- aurora/site/source/documentation/0.14.0/features/containers.md (added)
+++ aurora/site/source/documentation/0.14.0/features/containers.md Tue Jun 14 21:35:25 2016
@@ -0,0 +1,60 @@
Containers
==========

Docker
------

Aurora has optional support for launching Docker containers, if correctly [configured by an Operator](../../operations/configuration/#docker-containers).

Example (available in the [Vagrant environment](../../getting-started/vagrant/)):

    $ cat /vagrant/examples/jobs/docker/hello_docker.aurora
    hello_world_proc = Process(
      name = 'hello',
      cmdline = """
        while true; do
          echo hello world
          sleep 10
        done
      """)

    hello_world_docker = Task(
      name = 'hello docker',
      processes = [hello_world_proc],
      resources = Resources(cpu = 1, ram = 1*MB, disk=8*MB)
    )

    jobs = [
      Service(
        cluster = 'devcluster',
        environment = 'devel',
        role = 'docker-test',
        name = 'hello_docker',
        task = hello_world_docker,
        container = Container(docker = Docker(image = 'python:2.7'))
      )
    ]

In order to correctly execute processes inside a job, the docker container must have Python 2.7
installed. Further details of how to use Docker can be found in the
[Reference Documentation](../../reference/configuration/#docker-object).

Mesos
-----

*Note: In order to use filesystem images with Aurora, you must be running at least Mesos 0.28.x*

Aurora supports specifying a task filesystem image to use with the [Mesos containerizer](http://mesos.apache.org/documentation/latest/container-image/).
This is done by setting the `container` property of the Job to a `Mesos` container object
that includes the image to use. Both [AppC](https://github.com/appc/spec/blob/master/SPEC.md) and
[Docker](https://github.com/docker/docker/blob/master/image/spec/v1.md) images are supported.

```
job = Job(
  ...
  container = Mesos(image=DockerImage(name='my-image', tag='my-tag'))
  ...
)
```
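For AppC images, a sketch along the same lines; this assumes the `AppcImage` configuration object
and uses placeholder values for the image name and id:

```
job = Job(
  ...
  # 'my-appc-image' and the sha512 id below are placeholders.
  container = Mesos(image=AppcImage(name='my-appc-image', image_id='sha512-...'))
  ...
)
```
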
+
+- [Overview](#overview)
+- [Collision Policies](#collision-policies)
+- [Failure recovery](#failure-recovery)
+- [Interacting with cron jobs via the Aurora CLI](#interacting-with-cron-jobs-via-the-aurora-cli)
+  - [cron schedule](#cron-schedule)
+  - [cron deschedule](#cron-deschedule)
+  - [cron start](#cron-start)
+  - [job killall, job restart, job kill](#job-killall-job-restart-job-kill)
+- [Technical Note About Syntax](#technical-note-about-syntax)
+- [Caveats](#caveats)
+  - [Failovers](#failovers)
+  - [Collision policy is best-effort](#collision-policy-is-best-effort)
+  - [Timezone Configuration](#timezone-configuration)
+
+## Overview
+
+A job is identified as a cron job by the presence of a
+`cron_schedule` attribute containing a cron-style schedule in the
+[`Job`](../../reference/configuration/#job-objects) object. Examples of cron schedules
+include "every 5 minutes" (`*/5 * * * *`), "Fridays at 17:00" (`0 17 * * FRI`), and
+"the 1st and 15th day of the month at 03:00" (`0 3 1,15 * *`).
+
+Example (available in the [Vagrant environment](../../getting-started/vagrant/)):
+
+    $ cat /vagrant/examples/jobs/cron_hello_world.aurora
+    # A cron job that runs every 5 minutes.
+    jobs = [
+      Job(
+        cluster = 'devcluster',
+        role = 'www-data',
+        environment = 'test',
+        name = 'cron_hello_world',
+        cron_schedule = '*/5 * * * *',
+        task = SimpleTask(
+          'cron_hello_world',
+          'echo "Hello world from cron, the time is now $(date --rfc-822)"'),
+      ),
+    ]
+
+## Collision Policies
+
+The `cron_collision_policy` field specifies the scheduler's behavior when a new cron job is
+triggered while an older run hasn't finished. The scheduler has two policies available:
+
+* `KILL_EXISTING`: The default policy - on a collision the old instances are killed and
+instances with the current configuration are started.
+* `CANCEL_NEW`: On a collision the new run is cancelled.
+
+Note that the use of `CANCEL_NEW` is likely a code smell - interrupted cron jobs should be able
+to recover their progress on a subsequent invocation, otherwise they risk having their work queue
+grow faster than they can process it.
+
+## Failure recovery
+
+Unlike with services, which Aurora will always re-execute regardless of exit status, instances of
+cron jobs retry according to the `max_task_failures` attribute of the
+[Task](../../reference/configuration/#task-object) object. To get "run-until-success" semantics,
+set `max_task_failures` to `-1`.
+
+## Interacting with cron jobs via the Aurora CLI
+
+Most interaction with cron jobs takes place using the `cron` subcommand. See `aurora cron -h`
+for up-to-date usage instructions.
+
+### cron schedule
+Schedules a new cron job on the Aurora cluster for later runs or replaces the existing cron template
+with a new one. Only future runs will be affected; any existing active tasks are left intact.
+
+    $ aurora cron schedule devcluster/www-data/test/cron_hello_world /vagrant/examples/jobs/cron_hello_world.aurora
+
+### cron deschedule
+Deschedules a cron job, preventing future runs but allowing current runs to complete.
+
+    $ aurora cron deschedule devcluster/www-data/test/cron_hello_world
+
+### cron start
+Starts a cron job immediately, outside of its normal cron schedule.
+
+    $ aurora cron start devcluster/www-data/test/cron_hello_world
+
+### job killall, job restart, job kill
+Cron jobs create instances running on the cluster that you can interact with, like normal Aurora
+tasks, using `job kill` and `job restart`.
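+
+For example, to kill all instances currently running for the cron job scheduled above (the job
+key follows the earlier example):
+
+    $ aurora job killall devcluster/www-data/test/cron_hello_world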
+
+
+## Technical Note About Syntax
+
+`cron_schedule` uses a restricted subset of BSD crontab syntax. While the
+execution engine currently uses Quartz, the schedule parsing is custom and accepts only this
+subset of FreeBSD [crontab(5)](http://www.freebsd.org/cgi/man.cgi?crontab(5)) syntax. See
+[the source](https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/cron/CrontabEntry.java#L106-L124)
+for details.
+
+
+## Caveats
+
+### Failovers
+No failover recovery. Aurora does not record the latest minute for which it
+fired triggers across failovers. Therefore it's possible to miss triggers
+on failover. Note that this behavior may change in the future.
+
+It's necessary to sync time between schedulers with something like `ntpd`.
+Clock skew could cause double or missed triggers in the case of a failover.
+
+### Collision policy is best-effort
+Aurora aims to always have *at least one copy* of a given instance running at a time - it's
+an AP system, meaning it chooses Availability and Partition Tolerance at the expense of
+Consistency.
+
+If your collision policy was `CANCEL_NEW` and a task has terminated but
+Aurora has not noticed this, Aurora will go ahead and create your new
+task.
+
+If your collision policy was `KILL_EXISTING` and a task was marked `LOST`
+but not yet GCed, Aurora will go ahead and create your new task without
+attempting to kill the old one (outside the GC interval).
+
+### Timezone Configuration
+Cron timezone is configured independently of JVM timezone with the `-cron_timezone` flag and
+defaults to UTC.

Added: aurora/site/source/documentation/0.14.0/features/job-updates.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.14.0/features/job-updates.md?rev=1748470&view=auto
==============================================================================
--- aurora/site/source/documentation/0.14.0/features/job-updates.md (added)
+++ aurora/site/source/documentation/0.14.0/features/job-updates.md Tue Jun 14 21:35:25 2016
@@ -0,0 +1,111 @@
+Aurora Job Updates
+==================
+
+`Job` configurations can be updated at any point in their lifecycle.
+Usually updates are done incrementally using a process called a *rolling
+upgrade*, in which Tasks are upgraded in small groups, one group at a
+time. Updates are done using various Aurora Client commands.
+
+
+Rolling Job Updates
+-------------------
+
+There are several sub-commands to manage job updates:
+
+    aurora update start
+    aurora update info
+    aurora update pause
+    aurora update resume
+    aurora update abort
+    aurora update list
+
+When you `start` a job update, the command will return once it has sent the
+instructions to the scheduler. At that point, you may view detailed
+progress for the update with the `info` subcommand, in addition to viewing
+graphical progress in the web browser. You may also get a full listing of
+in-progress updates in a cluster with `list`.
+
+Once an update has been started, you can `pause` to keep the update but halt
+progress. This can be useful for tasks like debugging a partially-updated
+job to determine whether you would like to proceed. You can `resume` the
+update to continue.
+
+You may `abort` a job update regardless of the state it is in. This will
+instruct the scheduler to completely abandon the job update and leave the job
+in the current (possibly partially-updated) state.
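+
+Putting these together, a typical interaction might look like the following (a sketch; the job
+key and configuration file name are illustrative):
+
+    $ aurora update start devcluster/www-data/test/hello new_config.aurora
+    $ aurora update info devcluster/www-data/test/hello
+    $ aurora update pause devcluster/www-data/test/hello
+    $ aurora update resume devcluster/www-data/test/hello
+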
+For a configuration update, the Aurora Client calculates required changes
+by examining the current job config state and the new desired job config.
+It then starts a *rolling batched update process* by going through every batch
+and performing these operations:
+
+- If an instance is present in the scheduler but isn't in the new config,
+  then that instance is killed.
+- If an instance is not present in the scheduler but is present in
+  the new config, then the instance is created.
+- If an instance is present in both the scheduler and the new config, then
+  the client diffs both task configs. If it detects any changes, it
+  performs an instance update by killing the old config instance and adding
+  the new config instance.
+
+The Aurora client continues through the instance list until all tasks are
+updated, in `RUNNING`, and healthy for a configurable amount of time.
+If the client determines the update is not going well (a percentage of health
+checks have failed), it cancels the update.
+
+Update cancellation runs a procedure similar to the update sequence described
+above, but in reverse order. New instance configs are swapped
+with old instance configs and batch updates proceed backwards
+from the point where the update failed. E.g. batches (0,1,2), (3,4,5), (6,7,8-FAIL)
+result in a rollback in the order (8,7,6), (5,4,3), (2,1,0).
+
+For details on how to control a job update, please see the
+[UpdateConfig](../../reference/configuration/#updateconfig-objects) configuration object.
+
+
+Coordinated Job Updates
+-----------------------
+
+Some Aurora services may benefit from having more control over updates by explicitly
+acknowledging ("heartbeating") job update progress. This may be helpful for mission-critical
+service updates where explicit job health monitoring is vital during the entire job update
+lifecycle. Such job updates would rely on an external service (or a custom client) periodically
+pulsing an active coordinated job update via a
+[pulseJobUpdate RPC](https://github.com/apache/aurora/blob/rel/0.14.0/api/src/main/thrift/org/apache/aurora/gen/api.thrift).
+
+A coordinated update is defined by setting a positive
+[pulse_interval_secs](../../reference/configuration/#updateconfig-objects) value in the job
+configuration file. If no pulses are received within the specified interval, the update will be
+blocked. A blocked update is unable to continue rolling forward (or rolling back) but retains its
+active status. It may only be unblocked by a fresh `pulseJobUpdate` call.
+
+NOTE: A coordinated update starts in `ROLL_FORWARD_AWAITING_PULSE` state and will not make any
+progress until the first pulse arrives. However, a paused update (`ROLL_FORWARD_PAUSED` or
+`ROLL_BACK_PAUSED`) is still considered active and upon resuming will immediately make progress
+provided the pulse interval has not expired.
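+
+A minimal sketch of enabling a coordinated update in a job configuration file (the interval
+value is illustrative and given in seconds):
+
+    jobs = [
+      Service(
+        ...
+        update_config = UpdateConfig(
+          pulse_interval_secs = 60
+        ),
+      )
+    ]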
+
+
+Canary Deployments
+------------------
+
+Canary deployments are a pattern for rolling out updates to a subset of job instances,
+in order to test different code versions alongside the actual production job.
+It is a risk-mitigation strategy for job owners and commonly used in a form where
+job instance 0 runs with a different configuration than the instances 1-N.
+
+For example, consider a job with 4 instances that each
+request 1 core of cpu, 1 GB of RAM, and 1 GB of disk space as specified
+in the configuration file `hello_world.aurora`. If you want to
+update it so it requests 2 GB of RAM instead of 1, you can create a new
+configuration file to do that called `new_hello_world.aurora` and
+issue
+
+    aurora update start <job key>/0-1 new_hello_world.aurora
+
+This results in instances 0 and 1 having 1 cpu, 2 GB of RAM, and 1 GB of disk space,
+while instances 2 and 3 have 1 cpu, 1 GB of RAM, and 1 GB of disk space. If instance 3
+dies and restarts, it restarts with 1 cpu, 1 GB RAM, and 1 GB disk space.
+
+This means that two task configurations exist for the same job at the same time, each
+valid for a different range of instances. While this isn't a recommended
+pattern, it is valid and supported by the Aurora scheduler.

Added: aurora/site/source/documentation/0.14.0/features/multitenancy.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.14.0/features/multitenancy.md?rev=1748470&view=auto
==============================================================================
--- aurora/site/source/documentation/0.14.0/features/multitenancy.md (added)
+++ aurora/site/source/documentation/0.14.0/features/multitenancy.md Tue Jun 14 21:35:25 2016
@@ -0,0 +1,62 @@
+Multitenancy
+============
+
+Aurora is a multi-tenant system that can run jobs of multiple clients/tenants.
+Going beyond the [resource isolation on an individual host](../resource-isolation/), it is
+crucial to prevent those jobs from stepping on each other's toes.
+
+
+Job Namespaces
+--------------
+
+The namespace for jobs in Aurora follows a hierarchical structure. This is meant to make it easier
+to differentiate between different jobs. A job key consists of four parts,
+`<cluster>/<role>/<environment>/<jobname>`, in that order:
+
+* Cluster refers to the name of a particular Aurora installation.
+* Role names are user accounts.
+* Environment names are namespaces.
+* Jobname is the custom name of your job.
+
+Role names correspond to user accounts. They are used for
+[authentication](../../operations/security/), as the Linux user used to run jobs, and for the
+assignment of [quota](#preemption). If you don't know what accounts are available, contact your
+sysadmin.
+
+The environment component in the job key serves as a namespace. The values for
+environment are validated in the client and the scheduler so as to allow any of `devel`, `test`,
+`production`, and any value matching the regular expression `staging[0-9]*`.
+
+None of the values imply any difference in the scheduling behavior. Conventionally, the
+"environment" is set so as to indicate a certain level of stability in the behavior of the job
+by ensuring that an appropriate level of testing has been performed on the application code. E.g.
+in the case of a typical Job, releases may progress through the following phases in order of
+increasing level of stability: `devel`, `test`, `staging`, `production`.
+
+
+Preemption
+----------
+
+In order to guarantee that important production jobs are always running, Aurora supports
+preemption.
+
+Consider a pending job that is a candidate for scheduling, but a resource shortage
+prevents it from being scheduled. Active tasks can become the victim of preemption if:
+
+ - both candidate and victim are owned by the same role and the
+   [priority](../../reference/configuration/#job-objects) of a victim is lower than the
+   [priority](../../reference/configuration/#job-objects) of the candidate.
+ - OR a victim is non-[production](../../reference/configuration/#job-objects) and the candidate is
+   [production](../../reference/configuration/#job-objects).
+
+In other words, tasks from [production](../../reference/configuration/#job-objects) jobs may preempt
+tasks from any non-production job. However, a production task may only be preempted by tasks from
+production jobs in the same role with higher [priority](../../reference/configuration/#job-objects).
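+
+As a sketch, these attributes are set via the `production` and `priority` fields of the job
+configuration (values are illustrative):
+
+    Job(
+      ...
+      production = True,
+      priority = 10,
+    )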
+
+Aurora requires resource quotas for [production non-dedicated jobs](../../reference/configuration/#job-objects).
+Quota is enforced at the job role level and when set, defines a non-preemptible pool of compute resources within
+that role. All job types (service, adhoc or cron) require role resource quota unless the job has a
+[dedicated constraint](../constraints/#dedicated-attribute) set.
+
+To grant quota to a particular role in production, an operator can use the command
+`aurora_admin set_quota`.

Added: aurora/site/source/documentation/0.14.0/features/resource-isolation.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.14.0/features/resource-isolation.md?rev=1748470&view=auto
==============================================================================
--- aurora/site/source/documentation/0.14.0/features/resource-isolation.md (added)
+++ aurora/site/source/documentation/0.14.0/features/resource-isolation.md Tue Jun 14 21:35:25 2016
@@ -0,0 +1,179 @@
+Resource Isolation and Sizing
+=============================
+
+- [Isolation](#isolation)
+- [Sizing](#sizing)
+- [Oversubscription](#oversubscription)
+
+
+Isolation
+---------
+
+Aurora is a multi-tenant system; a single software instance runs on a
+server, serving multiple clients/tenants. To share resources among
+tenants, it implements isolation of:
+
+* CPU
+* memory
+* disk space
+
+CPU is a soft limit, and handled differently from memory and disk space.
+Too low a CPU value results in throttling your application and
+slowing it down. Memory and disk space are both hard limits; when your
+application goes over these values, it's killed.
+
+### CPU Isolation
+
+Mesos uses a quota-based CPU scheduler (the *Completely Fair Scheduler*)
+to provide consistent and predictable performance. This is effectively
+a guarantee of resources -- you receive at least what you requested, but
+also no more than you've requested.
+
+The scheduler gives applications a CPU quota for every 100 ms interval.
+When an application uses its quota for an interval, it is throttled for
+the rest of the 100 ms. Usage resets for each interval and unused
+quota does not carry over.
+
+For example, an application specifying 4.0 CPU has access to 400 ms of
+CPU time every 100 ms. This CPU quota can be used in different ways,
+depending on the application and available resources. Consider the
+scenarios shown in this diagram.
+
+![CPU Availability](../images/CPUavailability.png)
+
+* *Scenario A*: the application can use up to 4 cores continuously for
+every 100 ms interval. It is never throttled and starts processing
+new requests immediately.
+
+* *Scenario B*: the application uses up to 8 cores (depending on
+availability) but is throttled after 50 ms. The CPU quota resets at the
+start of each new 100 ms interval.
+
+* *Scenario C*: is like Scenario A, but there is a garbage collection
+event in the second interval that consumes all CPU quota. The
+application throttles for the remaining 75 ms of that interval and
+cannot service requests until the next interval. In this example, the
+garbage collection finished in one interval but, depending on how much
+garbage needs collecting, it may take more than one interval and further
+delay service of requests.
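+
+As a worked example of the quota arithmetic (assumed numbers): a task requesting `cpu = 0.5`
+receives 0.5 * 100 ms = 50 ms of CPU time per interval, so a single continuously-busy thread
+would be throttled for the remaining 50 ms of every 100 ms interval.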
+
+*Technical Note*: Mesos considers logical cores, also known as
+hyperthreading or SMT cores, as the unit of CPU.
+
+### Memory Isolation
+
+Mesos uses dedicated memory allocation. Your application always has
+access to the amount of memory specified in your configuration. The
+application's memory use is defined as the sum of the resident set size
+(RSS) of all processes in a shard. Each shard is considered
+independently.
+
+In other words, say you specified a memory size of 10GB. Each shard
+would receive 10GB of memory. If an individual shard's memory demands
+exceed 10GB, that shard is killed, but the other shards continue
+working.
+
+*Technical note*: Total memory size is not enforced at allocation time,
+so your application can request more than its allocation without getting
+an ENOMEM. However, it will be killed shortly after.
+
+### Disk Space
+
+Disk space used by your application is defined as the sum of the files'
+disk space in your application's directory, including the `stdout` and
+`stderr` logged from your application. Each shard is considered
+independently. You should use off-node storage for your application's
+data whenever possible.
+
+In other words, say you specified a disk space size of 100MB. Each shard
+would receive 100MB of disk space. If an individual shard's disk space
+demands exceed 100MB, that shard is killed, but the other shards
+continue working.
+
+After your application finishes running, its allocated disk space is
+reclaimed. Thus, your job's final action should move any disk content
+that you want to keep, such as logs, to your home file system or other
+less transitory storage. Disk reclamation takes place an undefined
+period after the application finishes; until then, the disk contents
+are still available but you shouldn't count on them being so.
+
+*Technical note*: Disk space is not enforced at write time, so your
+application can write above its quota without getting an ENOSPC, but it
+will be killed shortly after. This is subject to change.
+
+### GPU Isolation
+
+GPU isolation will be supported for Nvidia devices starting from Mesos 0.29.0.
+Access to the allocated units will be exclusive with no sharing between tasks
+allowed (e.g. no fractional GPU allocation). Until official documentation is released,
+see the [Mesos design document](https://docs.google.com/document/d/10GJ1A80x4nIEo8kfdeo9B11PIbS1xJrrB4Z373Ifkpo/edit#heading=h.w84lz7p4eexl)
+for more details.
+
+### Other Resources
+
+Other resources, such as network bandwidth, do not have any performance
+guarantees. For some resources, such as memory bandwidth, there are no
+practical sharing methods, so some application combinations collocated on
+the same host may cause contention.
+
+
+Sizing
+------
+
+### CPU Sizing
+
+To correctly size Aurora-run Mesos tasks, specify a per-shard CPU value
+that lets the task run at its desired performance when at peak load
+distributed across all shards. Include reserve capacity of at least 50%,
+possibly more, depending on how critical your service is (or how
+confident you are about your original estimate :-)), ideally by
+increasing the number of shards to also improve resiliency. When running
+your application, observe its CPU stats over time. If it is consistently at
+or near your quota during peak load, you should consider increasing either
+the per-shard CPU or the number of shards.
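+
+As a rough worked example (assumed numbers): if each shard needs about 2 cores at peak, a 50%
+reserve suggests requesting 2.0 * 1.5 = 3.0 CPUs per shard, e.g.
+
+    resources = Resources(cpu = 3.0, ram = 4*GB, disk = 1*GB)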
+
+### Memory Sizing
+
+Size for your application's peak requirement. Observe the per-instance
+memory statistics over time, as memory requirements can vary over
+different periods. Remember that if your application exceeds its memory
+value, it will be killed, so you should also add a safety margin of
+around 10-20%. If you have the ability to do so, you may also want to
+put alerts on the per-instance memory.
+
+### Disk Space Sizing
+
+Size for your application's peak requirement. Rotate and discard log
+files as needed to stay within your quota. When running a Java process,
+add the maximum size of the Java heap to your disk space requirement, in
+order to account for an out of memory error dumping the heap
+into the application's sandbox space.
+
+### GPU Sizing
+
+GPU sizing is highly dependent on your application's requirements and is only
+limited by the number of physical GPU units available on a target box.
+
+Oversubscription
+----------------
+
+**WARNING**: This feature is currently in alpha status. Do not use it in production clusters!
+
+Mesos [supports a concept of revocable tasks](http://mesos.apache.org/documentation/latest/oversubscription/)
+by oversubscribing machine resources by the amount deemed safe to not affect the existing
+non-revocable tasks. Aurora supports revocable jobs via the `tier` setting, which must be
+set to the value `revocable`.
+
+The Aurora scheduler must be configured to receive revocable offers from Mesos and accept revocable
+jobs. If not configured properly, revocable tasks will never be assigned to hosts and will stay in
+`PENDING`. Set this scheduler flag to allow receiving revocable Mesos offers:
+
+    -receive_revocable_resources=true
+
+Specify a tier configuration file path (unless you want to use the [default](https://github.com/apache/aurora/blob/rel/0.14.0/src/main/resources/org/apache/aurora/scheduler/tiers.json)):
+
+    -tier_config=path/to/tiers/config.json
+
+
+See the [Configuration Reference](../../reference/configuration/) for details on how to mark a job
+as being revocable.

Added: aurora/site/source/documentation/0.14.0/features/service-discovery.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.14.0/features/service-discovery.md?rev=1748470&view=auto
==============================================================================
--- aurora/site/source/documentation/0.14.0/features/service-discovery.md (added)
+++ aurora/site/source/documentation/0.14.0/features/service-discovery.md Tue Jun 14 21:35:25 2016
@@ -0,0 +1,42 @@
+Service Discovery
+=================
+
+It is possible for the Aurora executor to announce tasks into ServerSets for
+the purpose of service discovery. ServerSets use the Zookeeper [group membership pattern](http://zookeeper.apache.org/doc/trunk/recipes.html#sc_outOfTheBox)
+of which there are several reference implementations:
+
+  - [C++](https://github.com/apache/mesos/blob/master/src/zookeeper/group.cpp)
+  - [Java](https://github.com/twitter/commons/blob/master/src/java/com/twitter/common/zookeeper/ServerSetImpl.java#L221)
+  - [Python](https://github.com/twitter/commons/blob/master/src/python/twitter/common/zookeeper/serverset/serverset.py#L51)
+
+These can also be used natively in Finagle using the [ZookeeperServerSetCluster](https://github.com/twitter/finagle/blob/master/finagle-serversets/src/main/scala/com/twitter/finagle/zookeeper/ZookeeperServerSetCluster.scala).
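+
+A minimal sketch of enabling announcing for a job (assuming the job exposes a port named `http`
+in its configuration):
+
+    jobs = [
+      Service(
+        ...
+        announce = Announcer(primary_port = 'http'),
+      )
+    ]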
+
+For more information about how to configure announcing, see the [Configuration Reference](../../reference/configuration/).
+
+Using Mesos DiscoveryInfo
+-------------------------
+Experimental support for populating DiscoveryInfo in Mesos is introduced in Aurora. This can be used to build
+a custom service discovery system that does not rely on ZooKeeper. Please see the `Service Discovery` section in the
+[Mesos Framework Development guide](http://mesos.apache.org/documentation/latest/app-framework-development-guide/) for an
+explanation of the protobuf message in Mesos.
+
+To use this feature, enable the `--populate_discovery_info` flag on the scheduler. All jobs started by the scheduler
+afterwards will have their portmap populated in Mesos and discoverable via the `/state` endpoint of the Mesos master
+and agents.
+
+### Using Mesos DNS
+One example is [Mesos-DNS](https://github.com/mesosphere/mesos-dns), which is able to generate multiple DNS
+records. With the current implementation, the example job with key `devcluster/vagrant/test/http-example` generates at
+least the following:
+
+1. An A record for `http_example.test.vagrant.aurora.mesos` (which only includes the IP address);
+2. A [SRV record](https://en.wikipedia.org/wiki/SRV_record) for
+   `_http_example.test.vagrant._tcp.aurora.mesos`, which includes the IP address and every port. This should only
+   be used if the service has a single port.
+3. A SRV record `_{port-name}._http_example.test.vagrant._tcp.aurora.mesos` for each port name
+   defined. This should be used when the service has multiple ports.
+
+Things to note:
+
+1. The domain part (`.mesos` in the above example) can be configured in [Mesos DNS](http://mesosphere.github.io/mesos-dns/docs/configuration-parameters.html);
+2. Right now, the portmap and port aliases in the announcer object are not reflected in DiscoveryInfo, and are therefore
+   not visible in Mesos DNS records either. This is because they are only resolved in the Thermos executor.
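+
+For example, the SRV record above could be inspected with standard DNS tooling (a sketch; the
+record name follows the example above):
+
+    $ dig _http_example.test.vagrant._tcp.aurora.mesos SRV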