aurora-commits mailing list archives

Subject svn commit: r1799392 [13/14] - in /aurora/site: publish/blog/aurora-0-18-0-released/ publish/documentation/0.18.0/ publish/documentation/0.18.0/additional-resources/ publish/documentation/0.18.0/additional-resources/presentations/ publish/documentation...
Date Wed, 21 Jun 2017 06:36:25 GMT
Added: aurora/site/source/documentation/0.18.0/reference/
--- aurora/site/source/documentation/0.18.0/reference/ (added)
+++ aurora/site/source/documentation/0.18.0/reference/ Wed Jun 21 06:36:21 2017
@@ -0,0 +1,306 @@
+Aurora Configuration Templating
+The `.aurora` file format is just Python. However, `Job`, `Task`,
+`Process`, and other classes are defined by a templating library called
+*Pystachio*, a powerful tool for configuration specification and reuse.
+[Aurora Configuration Reference](../configuration/)
+has a full reference of all Aurora/Thermos defined Pystachio objects.
+When writing your `.aurora` file, you may use any Pystachio datatypes, as
+well as any objects shown in the *Aurora+Thermos Configuration
+Reference* without `import` statements - the Aurora config loader
+injects them automatically. Other than that the `.aurora` format
+works like any other Python script.
+Templating 1: Binding in Pystachio
+Pystachio uses the visually distinctive {{}} to indicate template
+variables. These are often called "mustache variables" after the
+similarly appearing variables in the Mustache templating system and
+because the curly braces resemble mustaches.
+If you are familiar with the Mustache system, templates in Pystachio
+have significant differences. They have no nesting, joining, or
+inheritance semantics. On the other hand, when evaluated, templates
+are evaluated iteratively, so this affords some level of indirection.
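+As a rough illustration of iterative evaluation, the following plain-Python
+sketch resolves innermost variables pass by pass until nothing changes. It is
+a simplified model, not Pystachio's implementation: the `resolve` helper and
+its `max_passes` cap are hypothetical names, and leaving unbound variables in
+place is a choice made for this sketch.
```python
import re

# Matches an innermost {{name}} reference, i.e. one with no braces inside it.
_VARIABLE = re.compile(r"\{\{([^{}]+)\}\}")


def resolve(template, bindings, max_passes=10):
    """Substitute bound variables pass by pass until the text stops changing."""

    def substitute(match):
        name = match.group(1)
        # Assumption for this sketch: unbound variables are left as they are.
        return str(bindings[name]) if name in bindings else match.group(0)

    for _ in range(max_passes):
        resolved = _VARIABLE.sub(substitute, template)
        if resolved == template:
            break
        template = resolved
    return template


bindings = {"metavar1": "first", "metavar2": "second", "number": 2}
print(resolve("{{metavar{{number}}}}", bindings))
```
+The first pass turns `{{metavar{{number}}}}` into `{{metavar2}}`, and the
+second pass resolves that to `second`. The pass cap keeps a self-referential
+binding from looping forever in this model.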
+Let's start with the simplest template: text with one
+variable, in this case `name`:
+    Hello {{name}}
+If we evaluate this as is, we'd get back:
+    Hello
+If a template variable doesn't have a value, when evaluated it's
+replaced with nothing. If we add a binding to give it a value:
+    { "name" : "Tom" }
+We'd get back:
+    Hello Tom
+Every Pystachio object has an associated `.bind` method that can bind
+values to {{}} variables. Bindings are not immediately evaluated.
+Instead, they are evaluated only when the interpolated value of the
+object is necessary, e.g. for performing equality or serializing a
+message over the wire.
+Objects with and without mustache templated variables behave differently:
+    >>> Float(1.5)
+    Float(1.5)
+    >>> Float('{{x}}.5')
+    Float({{x}}.5)
+    >>> Float('{{x}}.5').bind(x = 1)
+    Float(1.5)
+    >>> Float('{{x}}.5').bind(x = 1) == Float(1.5)
+    True
+    >>> contextual_object = String('{{metavar{{number}}}}').bind(
+    ... metavar1 = "first", metavar2 = "second")
+    >>> contextual_object
+    String({{metavar{{number}}}})
+    >>> contextual_object.bind(number = 1)
+    String(first)
+    >>> contextual_object.bind(number = 2)
+    String(second)
+You usually bind simple key to value pairs, but you can also bind three
+other objects: lists, dictionaries, and structurals. These will be
+described in detail later.
+### Structurals in Pystachio / Aurora
+Most Aurora/Thermos users don't ever (knowingly) interact with `String`,
+`Float`, or `Integer` Pystachio objects directly. Instead they interact
+with derived structural (`Struct`) objects that are collections of
+fundamental and structural objects. The structural object components are
+called *attributes*. Aurora's most used structural objects are `Job`,
+`Task`, and `Process`:
+    class Process(Struct):
+      cmdline = Required(String)
+      name = Required(String)
+      max_failures = Default(Integer, 1)
+      daemon = Default(Boolean, False)
+      ephemeral = Default(Boolean, False)
+      min_duration = Default(Integer, 5)
+      final = Default(Boolean, False)
+Construct default objects by following the object's type with (). If you
+want an attribute to have a value different from its default, include
+the attribute name and value inside the parentheses.
+    >>> Process()
+    Process(daemon=False, max_failures=1, ephemeral=False,
+      min_duration=5, final=False)
+Attribute values can be template variables, which then receive specific
+values when creating the object.
+    >>> Process(cmdline = 'echo {{message}}')
+    Process(daemon=False, max_failures=1, ephemeral=False, min_duration=5,
+            cmdline=echo {{message}}, final=False)
+    >>> Process(cmdline = 'echo {{message}}').bind(message = 'hello world')
+    Process(daemon=False, max_failures=1, ephemeral=False, min_duration=5,
+            cmdline=echo hello world, final=False)
+A powerful binding property is that all of an object's children inherit its bindings:
+    >>> List(Process)([
+    ... Process(name = '{{prefix}}_one'),
+    ... Process(name = '{{prefix}}_two')
+    ... ]).bind(prefix = 'hello')
+    ProcessList(
+      Process(daemon=False, name=hello_one, max_failures=1, ephemeral=False, min_duration=5, final=False),
+      Process(daemon=False, name=hello_two, max_failures=1, ephemeral=False, min_duration=5, final=False)
+      )
+Remember that an Aurora Job contains Tasks which contain Processes. A
+Job level binding is inherited by its Tasks and all their Processes.
+Similarly a Task level binding is available to that Task and its
+Processes but is *not* visible at the Job level (inheritance is a
+one-way street).
+#### Mustaches Within Structurals
+When you define a `Struct` schema, one powerful, but confusing, feature
+is that all of that structure's attributes are Mustache variables within
+the enclosing scope *once they have been populated*.
+For example, when `Process` is defined above, all its attributes, such as
+{{`name`}}, {{`cmdline`}}, and {{`max_failures`}}, are immediately
+defined as Mustache variables, implicitly bound into the `Process`, and
+inherited by all child objects once they are defined.
+Thus, you can do the following:
+    >>> Process(name = "installer", cmdline = "echo {{name}} is running")
+    Process(daemon=False, name=installer, max_failures=1, ephemeral=False, min_duration=5,
+            cmdline=echo installer is running, final=False)
+WARNING: This binding only takes place in one direction. For example,
+the following does NOT work and does not set the `Process` `name`
+attribute's value.
+    >>> Process().bind(name = "installer")
+    Process(daemon=False, max_failures=1, ephemeral=False, min_duration=5, final=False)
+The following is also not possible and results in an infinite loop that
+attempts to resolve `Process.name`.
+    >>> Process(name = '{{name}}').bind(name = 'installer')
+Do not confuse Structural attributes with bound Mustache variables.
+Attributes are implicitly converted to Mustache variables but not vice versa.
+### Templating 2: Structurals Are Factories
+#### A Second Way of Templating
+A second templating method is both as powerful as the aforementioned and
+often confused with it. This method is due to automatic conversion of
+Struct attributes to Mustache variables as described above.
+Suppose you create a Process object:
+    >>> p = Process(name = "process_one", cmdline = "echo hello world")
+    >>> p
+    Process(daemon=False, name=process_one, max_failures=1, ephemeral=False, min_duration=5,
+            cmdline=echo hello world, final=False)
+This `Process` object, "`p`", can be used wherever a `Process` object is
+needed. It can also be reused by changing the value(s) of its
+attribute(s). Here we change its `name` attribute from `process_one` to
+`process_two`:
+    >>> p(name = "process_two")
+    Process(daemon=False, name=process_two, max_failures=1, ephemeral=False, min_duration=5,
+            cmdline=echo hello world, final=False)
+Template creation is a common use for this technique:
+    >>> Daemon = Process(daemon = True)
+    >>> logrotate = Daemon(name = 'logrotate', cmdline = './logrotate conf/logrotate.conf')
+    >>> mysql = Daemon(name = 'mysql', cmdline = 'bin/mysqld --safe-mode')
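+The factory behaviour can be modelled in plain Python, without Pystachio, by
+an immutable object whose call operator returns a modified copy. The
+`ProcessModel` class below is a hypothetical stand-in with only three
+attributes; it is a sketch of the idea, not how `Struct` is implemented.
```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ProcessModel:
    """Hypothetical stand-in for a Struct; not a Pystachio class."""
    name: str = ""
    cmdline: str = ""
    daemon: bool = False

    def __call__(self, **overrides):
        # Return a modified copy and leave the template object untouched.
        return replace(self, **overrides)


daemon_template = ProcessModel(daemon=True)
logrotate = daemon_template(name="logrotate", cmdline="./logrotate conf/logrotate.conf")
mysql = daemon_template(name="mysql", cmdline="bin/mysqld --safe-mode")
print(logrotate)
print(mysql)
```
+Both derived objects keep `daemon=True` from the template, while the template
+itself still has an empty `name` and can be reused.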
+### Advanced Binding
+As described above, `.bind()` binds simple strings or numbers to
+Mustache variables. In addition to Structural types formed by combining
+atomic types, Pystachio has two container types, `List` and `Map`, which
+can also be bound via `.bind()`.
+#### Bind Syntax
+The `bind()` function can take Python dictionaries or `kwargs`
+interchangeably (when "`kwargs`" is in a function definition, `kwargs`
+receives a Python dictionary containing all keyword arguments after the
+formal parameter list).
+    >>> String('{{foo}}').bind(foo = 'bar') == String('{{foo}}').bind({'foo': 'bar'})
+    True
+Bindings done "closer" to the object in question take precedence:
+    >>> p = Process(name = '{{context}}_process')
+    >>> t = Task().bind(context = 'global')
+    >>> t(processes = [p, p.bind(context = 'local')])
+    Task(processes=ProcessList(
+      Process(daemon=False, name=global_process, max_failures=1, ephemeral=False, final=False,
+              min_duration=5),
+      Process(daemon=False, name=local_process, max_failures=1, ephemeral=False, final=False,
+              min_duration=5)
+    ))
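+One way to picture this precedence rule is as a chain of scopes searched from
+the innermost outwards. The sketch below uses the standard library's
+`ChainMap` with hypothetical dictionaries standing in for the Task and Process
+bindings; it models the lookup order only and is not Pystachio code.
```python
from collections import ChainMap

# Hypothetical stand-ins for the two binding scopes in the example above.
task_bindings = {"context": "global"}    # like Task().bind(context = 'global')
process_bindings = {"context": "local"}  # like p.bind(context = 'local')

# A Process without its own binding sees only what it inherits from the Task.
inherited = ChainMap({}, task_bindings)
# A Process with its own binding shadows the inherited value.
shadowed = ChainMap(process_bindings, task_bindings)

names = ["{}_process".format(scope["context"]) for scope in (inherited, shadowed)]
print(names)
```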
+#### Binding Complex Objects
+##### Lists
+    >>> fibonacci = List(Integer)([1, 1, 2, 3, 5, 8, 13])
+    >>> String('{{fib[4]}}').bind(fib = fibonacci)
+    String(5)
+##### Maps
+    >>> first_names = Map(String, String)({'Kent': 'Clark', 'Wayne': 'Bruce', 'Prince': 'Diana'})
+    >>> String('{{first[Kent]}}').bind(first = first_names)
+    String(Clark)
+##### Structurals
+    >>> String('{{p.cmdline}}').bind(p = Process(cmdline = "echo hello world"))
+    String(echo hello world)
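+The three reference forms shown above (`name[index]`, `name[key]`, and
+`name.attribute`) can be illustrated with a small plain-Python lookup. The
+`lookup` helper and its regular expression are assumptions made for this
+sketch; Pystachio's own reference parsing is more general.
```python
import re
from types import SimpleNamespace

# name, optionally followed by [index-or-key] or .attribute
_REFERENCE = re.compile(r"^(\w+)(?:\[(\w+)\]|\.(\w+))?$")


def lookup(reference, bindings):
    """Resolve a single reference such as fib[4], first[Kent] or p.cmdline."""
    match = _REFERENCE.match(reference)
    if match is None:
        raise ValueError("unsupported reference: {!r}".format(reference))
    name, index, attribute = match.groups()
    value = bindings[name]
    if index is not None:
        # Lists take integer positions; maps take the key verbatim.
        return value[int(index)] if isinstance(value, list) else value[index]
    if attribute is not None:
        return getattr(value, attribute)
    return value


bindings = {
    "fib": [1, 1, 2, 3, 5, 8, 13],
    "first": {"Kent": "Clark", "Wayne": "Bruce", "Prince": "Diana"},
    "p": SimpleNamespace(cmdline="echo hello world"),
}
for reference in ("fib[4]", "first[Kent]", "p.cmdline"):
    print(reference, "->", lookup(reference, bindings))
```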
+### Structural Binding
+Use structural templates when binding more than two or three individual
+values at the Job or Task level. For fewer than two or three, standard
+key to string binding is sufficient.
+Structural binding is a very powerful pattern and is most useful in
+Aurora/Thermos for doing Structural configuration. For example, you can
+define a job profile. The following profile uses `HDFS`, the Hadoop
+Distributed File System, to designate a file's location. `HDFS` does
+not come with Aurora, so you'll need to either install it separately
+or change the way the dataset is designated.
+    class Profile(Struct):
+      version = Required(String)
+      environment = Required(String)
+      dataset = Default(String, 'hdfs://home/aurora/data/{{environment}}')
+    PRODUCTION = Profile(version = 'live', environment = 'prod')
+    DEVEL = Profile(version = 'latest',
+                    environment = 'devel',
+                    dataset = 'hdfs://home/aurora/data/test')
+    TEST = Profile(version = 'latest', environment = 'test')
+    JOB_TEMPLATE = Job(
+      name = 'application',
+      role = 'myteam',
+      cluster = 'cluster1',
+      environment = '{{profile.environment}}',
+      task = SequentialTask(
+        name = 'task',
+        resources = Resources(cpu = 2, ram = 4*GB, disk = 8*GB),
+        processes = [
+          Process(name = 'main',
+                  cmdline = 'java -jar application.jar -hdfsPath '
+                            '{{profile.dataset}}')
+        ]
+       )
+     )
+    jobs = [
+      JOB_TEMPLATE(instances = 100).bind(profile = PRODUCTION),
+      JOB_TEMPLATE.bind(profile = DEVEL),
+      JOB_TEMPLATE.bind(profile = TEST),
+     ]
+In this case, a custom structural "Profile" is created to self-document
+the configuration to some degree. This also allows some schema
+"type-checking", and for default self-substitution, e.g. in
+`Profile.dataset` above.
+So rather than a `.bind()` with a half-dozen substituted variables, you
+can bind a single object that has sensible defaults stored in a single place.

Added: aurora/site/source/documentation/0.18.0/reference/
--- aurora/site/source/documentation/0.18.0/reference/ (added)
+++ aurora/site/source/documentation/0.18.0/reference/ Wed Jun 21 06:36:21 2017
@@ -0,0 +1,531 @@
+Aurora Configuration Tutorial
+How to write Aurora configuration files, including feature descriptions
+and best practices. When writing a configuration file, make use of
+`aurora job inspect`. It takes the same job key and configuration file
+arguments as `aurora job create` or `aurora update start`. It first ensures the
+configuration parses, then outputs it in human-readable form.
+You should read this after going through the general [Aurora Tutorial](../../getting-started/tutorial/).
+- [The Basics](#the-basics)
+	- [Use Bottom-To-Top Object Ordering](#use-bottom-to-top-object-ordering)
+- [An Example Configuration File](#an-example-configuration-file)
+- [Defining Process Objects](#defining-process-objects)
+- [Getting Your Code Into The Sandbox](#getting-your-code-into-the-sandbox)
+- [Defining Task Objects](#defining-task-objects)
+	- [SequentialTask: Running Processes in Parallel or Sequentially](#sequentialtask-running-processes-in-parallel-or-sequentially)
+	- [SimpleTask](#simpletask)
+	- [Combining tasks](#combining-tasks)
+- [Defining Job Objects](#defining-job-objects)
+- [The jobs List](#the-jobs-list)
+- [Basic Examples](#basic-examples)
+The Basics
+To run a job on Aurora, you must specify a configuration file that tells
+Aurora what it needs to know to schedule the job, what Mesos needs to
+run the tasks the job is made up of, and what Thermos needs to run the
+processes that make up the tasks. This file must have
+a `.aurora` suffix.
+A configuration file defines a collection of objects, along with parameter
+values for their attributes. An Aurora configuration file contains the
+following three types of objects:
+- Job
+- Task
+- Process
+A configuration also specifies a list of `Job` objects assigned
+to the variable `jobs`.
+- jobs (list of defined Jobs to run)
+The `.aurora` file format is just Python. However, `Job`, `Task`,
+`Process`, and other classes are defined by a type-checked dictionary
+templating library called *Pystachio*, a powerful tool for
+configuration specification and reuse. Pystachio objects are tailored
+via {{}} surrounded templates.
+When writing your `.aurora` file, you may use any Pystachio datatypes, as
+well as any objects shown in the [*Aurora Configuration
+Reference*](../configuration/), without `import` statements - the
+Aurora config loader injects them automatically. Other than that, an `.aurora`
+file works like any other Python script.
+[*Aurora Configuration Reference*](../configuration/)
+has a full reference of all Aurora/Thermos defined Pystachio objects.
+### Use Bottom-To-Top Object Ordering
+A well-structured configuration starts with structural templates (if
+any). Structural templates encapsulate in their attributes all the
+differences between Jobs in the configuration that are not directly
+manipulated at the `Job` level, but typically at the `Process` or `Task`
+level, for example when certain processes are invoked with slightly
+different settings or input.
+After structural templates, define, in order, `Process`es, `Task`s, and
+`Job`s.
+Structural template names should be *UpperCamelCased* and their
+instantiations are typically *UPPER\_SNAKE\_CASED*. `Process`, `Task`,
+and `Job` names are typically *lower\_snake\_cased*. Indentation is typically 2 spaces.
+An Example Configuration File
+The following is a typical configuration file. Don't worry if there are
+parts you don't understand yet, but you may want to refer back to this
+as you read about its individual parts. Note that names surrounded by
+curly braces {{}} are template variables, which the system replaces with
+bound values for the variables.
+    # --- templates here ---
+    class Profile(Struct):
+      package_version = Default(String, 'live')
+      java_binary = Default(String, '/usr/lib/jvm/java-1.7.0-openjdk/bin/java')
+      extra_jvm_options = Default(String, '')
+      parent_environment = Default(String, 'prod')
+      parent_serverset = Default(String,
+                                 '/foocorp/service/bird/{{parent_environment}}/bird')
+    # --- processes here ---
+    main = Process(
+      name = 'application',
+      cmdline = '{{profile.java_binary}} -server -Xmx1792m '
+                '{{profile.extra_jvm_options}} '
+                '-jar application.jar '
+                '-upstreamService {{profile.parent_serverset}}'
+    )
+    # --- tasks ---
+    base_task = SequentialTask(
+      name = 'application',
+      processes = [
+        Process(
+          name = 'fetch',
+          cmdline = 'curl -O
+            {{profile.package_version}}/application.jar'),
+      ]
+    )
+    # not always necessary but often useful to have separate task
+    # resource classes
+    staging_task = base_task(resources =
+                     Resources(cpu = 1.0,
+                               ram = 2048*MB,
+                               disk = 1*GB))
+    production_task = base_task(resources =
+                        Resources(cpu = 4.0,
+                                  ram = 2560*MB,
+                                  disk = 10*GB))
+    # --- job template ---
+    job_template = Job(
+      name = 'application',
+      role = 'myteam',
+      contact = '',
+      instances = 20,
+      service = True,
+      task = production_task
+    )
+    # -- profile instantiations (if any) ---
+    PRODUCTION = Profile()
+    STAGING = Profile(
+      extra_jvm_options = '-Xloggc:gc.log',
+      parent_environment = 'staging'
+    )
+    # -- job instantiations --
+    jobs = [
+      job_template(cluster = 'cluster1', environment = 'prod')
+        .bind(profile = PRODUCTION),
+      job_template(cluster = 'cluster2', environment = 'prod')
+        .bind(profile = PRODUCTION),
+      job_template(cluster = 'cluster1',
+                   environment = 'staging',
+                   service = False,
+                   task = staging_task,
+                   instances = 2)
+        .bind(profile = STAGING),
+    ]
+## Defining Process Objects
+Processes are handled by the Thermos system. A process is a single
+executable step run as a part of an Aurora task, which consists of a
+bash-executable statement.
+The key (and required) `Process` attributes are:
+-   `name`: Any string which is a valid Unix filename (no slashes,
+    NULLs, or leading periods). The `name` value must be unique relative
+    to other Processes in a `Task`.
+-   `cmdline`: A command line run in a bash subshell, so you can use
+    bash scripts. Nothing is supplied for command-line arguments,
+    so `$*` is unspecified.
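+The naming rules above are easy to check before submitting a configuration.
+The two helpers below are hypothetical and only encode the rules as stated
+here (no slashes, NULs, or leading periods, and uniqueness within a Task);
+treating an empty name as invalid is an extra assumption, and this is not the
+validation Thermos itself performs.
```python
def is_valid_process_name(name):
    """Check the filename rules quoted above; not Thermos's own validator."""
    return (
        bool(name)                    # assumption: an empty name is rejected
        and "/" not in name
        and "\0" not in name
        and not name.startswith(".")
    )


def duplicate_names(names):
    """Return the names that occur more than once, in sorted order."""
    seen, duplicates = set(), set()
    for name in names:
        (duplicates if name in seen else seen).add(name)
    return sorted(duplicates)


candidates = ["stage", "app", ".hidden", "bin/run", "app"]
print([name for name in candidates if not is_valid_process_name(name)])
print(duplicate_names(candidates))
```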
+Many tiny processes make managing configurations more difficult. For
+example, the following is a bad way to define processes.
+    copy = Process(
+      name = 'copy',
+      cmdline = 'curl -O'
+    )
+    unpack = Process(
+      name = 'unpack',
+      cmdline = 'unzip'
+    )
+    remove = Process(
+      name = 'remove',
+      cmdline = 'rm -f'
+    )
+    run = Process(
+      name = 'app',
+      cmdline = 'java -jar app.jar'
+    )
+    run_task = Task(
+      processes = [copy, unpack, remove, run],
+      constraints = order(copy, unpack, remove, run)
+    )
+Since `cmdline` runs in a bash subshell, you can chain commands
+with `&&` or `||`.
+When defining a `Task` that is just a list of Processes run in a
+particular order, use `SequentialTask`, as described in the [*Defining*
+`Task` *Objects*](#defining-task-objects) section. The following simplifies and combines the
+above multiple `Process` definitions into just two.
+    stage = Process(
+      name = 'stage',
+      cmdline = 'curl -O && '
+                'unzip && rm -f')
+    run = Process(name = 'app', cmdline = 'java -jar app.jar')
+    run_task = SequentialTask(processes = [stage, run])
+`Process` also has optional attributes to customize its behaviour. Details can be found in the [Aurora Configuration Reference](../configuration/#process-objects).
+## Getting Your Code Into The Sandbox
+When using Aurora, you need to get your executable code into its "sandbox", specifically
+the Task sandbox where the code executes for the Processes that make up that Task.
+Each Task has a sandbox created when the Task starts and garbage
+collected when it finishes. All of a Task's processes run in its
+sandbox, so processes can share state by using a shared current
+working directory.
+Typically, you save this code somewhere. You then need to define a Process
+in your `.aurora` configuration file that fetches the code from that somewhere
+to where the agent can see it. For a public cloud, that can be anywhere public on
+the Internet, such as S3. For a private cloud, you need to put it
+on an accessible HDFS cluster or similar internal storage.
+The template for this Process is:
+    <name> = Process(
+      name = '<name>',
+      cmdline = '<command to copy and extract code archive into current working directory>'
+    )
+Note: Be sure the extracted code archive has an executable.
+## Getting Environment Variables Into The Sandbox
+Every time a process is forked, the Thermos executor checks for the existence of the
+`.thermos_profile` file. If the file exists, it is sourced.
+You can use this mechanism to pass environment variables to the sandbox.
+An example for this Process is:
+    setup_env = Process(
+      name = 'setup',
+      cmdline = '''cat <<EOF > .thermos_profile
+                   export RESULT=hello
+                   EOF'''
+    )
+    read_env = Process(
+      name = 'read',
+      cmdline = 'echo $RESULT'
+    )
+## Defining Task Objects
+Tasks are handled by Mesos. A task is a collection of processes that
+runs in a shared sandbox. It's the fundamental unit Aurora uses to
+schedule the datacenter; essentially what Aurora does is find places
+in the cluster to run tasks.
+The key (and required) parts of a Task are:
+-   `name`: A string giving the Task's name. By default, if a Task is
+    not given a name, it inherits the first name in its Process list.
+-   `processes`: An unordered list of Process objects bound to the Task.
+    The value of the optional `constraints` attribute affects the
+    contents as a whole. Currently, the only constraint, `order`, determines if
+    the processes run in parallel or sequentially.
+-   `resources`: A `Resources` object defining the Task's resource
+    footprint. A `Resources` object has three attributes:
+    -   `cpu`: A Float, the fractional number of cores the Task
+        requires.
+    -   `ram`: An Integer, RAM bytes the Task requires.
+    -   `disk`: An Integer, disk bytes the Task requires.
+A basic Task definition looks like:
+    Task(
+        name="hello_world",
+        processes=[Process(name = "hello_world", cmdline = "echo hello world")],
+        resources=Resources(cpu = 1.0,
+                            ram = 1*GB,
+                            disk = 1*GB))
+A Task has optional attributes to customize its behaviour. Details can be found in the [Aurora Configuration Reference](../configuration/#task-object).
+### SequentialTask: Running Processes in Parallel or Sequentially
+By default, a Task with several Processes runs them in parallel. There
+are two ways to run Processes sequentially:
+-   Include an `order` constraint in the Task definition's `constraints`
+    attribute whose arguments specify the processes' run order:
+        Task( ... processes=[process1, process2, process3],
+                  constraints = order(process1, process2, process3), ...)
+-   Use `SequentialTask` instead of `Task`; it automatically runs
+    processes in the order specified in the `processes` attribute. No
+    `constraints` attribute is needed:
+        SequentialTask( ... processes=[process1, process2, process3] ...)
+### SimpleTask
+For quickly creating simple tasks, use the `SimpleTask` helper. It
+creates a basic task from a provided name and command line using a
+default set of resources. For example, in a `.aurora` configuration:
+    SimpleTask(name="hello_world", command="echo hello world")
+is equivalent to
+    Task(name="hello_world",
+         processes=[Process(name = "hello_world", cmdline = "echo hello world")],
+         resources=Resources(cpu = 1.0,
+                             ram = 1*GB,
+                             disk = 1*GB))
+The simplest idiomatic Job configuration thus becomes:
+    import os
+    hello_world_job = Job(
+      task=SimpleTask(name="hello_world", command="echo hello world"),
+      role=os.getenv('USER'),
+      cluster="cluster1")
+When written to `hello_world.aurora`, you invoke it with a simple
+`aurora job create cluster1/$USER/test/hello_world hello_world.aurora`.
+### Combining tasks
+`Tasks.concat` (synonym: `concat_tasks`) and
+`Tasks.combine` (synonym: `combine_tasks`) merge multiple Task definitions
+into a single Task. It may be easier to define complex Jobs
+as smaller constituent Tasks. But since a Job only includes a single
+Task, the subtasks must be combined before using them in a Job.
+Smaller Tasks can also be reused between Jobs, instead of having to
+repeat their definition for multiple Jobs.
+With both methods, the merged Task takes the first Task's name. The
+difference between the two is the result Task's process ordering.
+-   `Tasks.combine` runs its subtasks' processes in no particular order.
+    The new Task's resource consumption is the sum of all its subtasks'
+    consumption.
+-   `Tasks.concat` runs its subtasks in the order supplied, with each
+    subtask's processes run serially between tasks. It is analogous to
+    the `order` constraint helper, except at the Task level instead of
+    the Process level. The new Task's resource consumption is the
+    maximum value specified by any subtask for each Resource attribute
+    (cpu, ram and disk).
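+The two resource rules above (sum for `Tasks.combine`, per-attribute maximum
+for `Tasks.concat`) can be worked through with plain dictionaries. Everything
+in this sketch is hypothetical: the helper names, the sample figures, and the
+`MB`/`GB` constants, which are assumed here to be byte multipliers.
```python
MB = 1024 * 1024  # assumed byte multipliers for this sketch
GB = 1024 * MB

ATTRIBUTES = ("cpu", "ram", "disk")


def combined_resources(*subtasks):
    """Subtasks may run side by side, so their requirements add up."""
    return {key: sum(task[key] for task in subtasks) for key in ATTRIBUTES}


def concatenated_resources(*subtasks):
    """Subtasks run one after another, so the largest requirement suffices."""
    return {key: max(task[key] for task in subtasks) for key in ATTRIBUTES}


setup = {"cpu": 0.5, "ram": 256 * MB, "disk": 2 * GB}
run = {"cpu": 2.0, "ram": 1 * GB, "disk": 1 * GB}

print(combined_resources(setup, run))
print(concatenated_resources(setup, run))
```
+With these sample figures, combining asks for 2.5 cores, 1280 MB of RAM and
+3 GB of disk, while concatenating asks for 2.0 cores, 1 GB of RAM and 2 GB of
+disk.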
+For example, given the following:
+    setup_task = Task(
+      ...
+      processes=[download_interpreter, update_zookeeper],
+      # It is important to note that `Tasks.concat` has
+      # no effect on the ordering of the processes within a task;
+      # hence the necessity of the `order` statement below
+      # (otherwise, the order in which `download_interpreter`
+      # and `update_zookeeper` run will be non-deterministic)
+      constraints=order(download_interpreter, update_zookeeper),
+      ...
+    )
+    run_task = SequentialTask(
+      ...
+      processes=[download_application, start_application],
+      ...
+    )
+    combined_task = Tasks.concat(setup_task, run_task)
+The `Tasks.concat` command merges the two Tasks into a single Task and
+ensures all processes in `setup_task` run before the processes
+in `run_task`. Conceptually, the task is reduced to:
+    task = Task(
+      ...
+      processes=[download_interpreter, update_zookeeper,
+                 download_application, start_application],
+      constraints=order(download_interpreter, update_zookeeper,
+                        download_application, start_application),
+      ...
+    )
+In the case of `Tasks.combine`, the two schedules run in parallel:
+    task = Task(
+      ...
+      processes=[download_interpreter, update_zookeeper,
+                 download_application, start_application],
+      constraints=order(download_interpreter, update_zookeeper) +
+                        order(download_application, start_application),
+      ...
+    )
+In the latter case, each of the two sequences may operate in parallel.
+Of course, this may not be the intended behavior (for example, if
+the `start_application` Process implicitly relies
+upon `download_interpreter`). Make sure you understand the difference
+between using one or the other.
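+To make the difference concrete, the sketch below models an ordering
+constraint as a list of (earlier, later) pairs and checks whether a given
+start order respects it. The `order` and `satisfies` functions here are
+hypothetical plain-Python models, not Aurora's `order` helper, and a single
+linear start order is a simplification of real parallel execution.
```python
def order(*names):
    """Model an ordering constraint as (earlier, later) pairs."""
    return list(zip(names, names[1:]))


def satisfies(schedule, constraints):
    """Return True if every 'earlier' process starts before its 'later' one."""
    position = {name: index for index, name in enumerate(schedule)}
    return all(position[first] < position[second] for first, second in constraints)


concat_constraints = order("download_interpreter", "update_zookeeper",
                           "download_application", "start_application")
combine_constraints = (order("download_interpreter", "update_zookeeper")
                       + order("download_application", "start_application"))

# A start order in which the application starts before the interpreter exists.
risky = ["download_application", "start_application",
         "download_interpreter", "update_zookeeper"]

print("allowed by combine:", satisfies(risky, combine_constraints))
print("allowed by concat:", satisfies(risky, concat_constraints))
```
+The risky order is permitted by the combined constraints but rejected by the
+concatenated ones, which is exactly the hazard described above.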
+## Defining Job Objects
+A job is a group of identical tasks that Aurora can run in a Mesos cluster.
+A `Job` object is defined by the values of several attributes, some
+required and some optional. The required attributes are:
+-   `task`: Task object to bind to this job. Note that a Job can
+    only take a single Task.
+-   `role`: Job's role account; in other words, the user account to run
+    the job as on a Mesos cluster machine. A common value is
+    `os.getenv('USER')`; using a Python command to get the user who
+    submits the job request. The other common value is the service
+    account that runs the job, e.g. `www-data`.
+-   `environment`: Job's environment, typical values
+    are `devel`, `test`, or `prod`.
+-   `cluster`: Aurora cluster to schedule the job in, defined in
+    `/etc/aurora/clusters.json` or `~/.clusters.json`. You can specify
+    jobs where the only difference is the `cluster`, then at run time
+    only run the Job whose job key includes your desired cluster's name.
+You usually see a `name` parameter. By default, `name` inherits its
+value from the Job's associated Task object, but you can override this
+default. For these four parameters, a Job definition might look like:
+    foo_job = Job( name = 'foo', cluster = 'cluster1',
+              role = os.getenv('USER'), environment = 'prod',
+              task = foo_task)
+In addition to the required attributes, there are several optional
+attributes. Details can be found in the [Aurora Configuration Reference](../configuration/#job-objects).
+## The jobs List
+At the end of your `.aurora` file, you need to specify a list of the
+file's defined Jobs. For example, the following exports the jobs `job1`,
+`job2`, and `job3`.
+    jobs = [job1, job2, job3]
+This allows the aurora client to invoke commands on those jobs, such as
+starting, updating, or killing them.
+Basic Examples
+These are provided to give a basic understanding of simple Aurora jobs.
+### hello_world.aurora
+Put the following in a file named `hello_world.aurora`, substituting your own values
+for values such as `cluster`s.
+    import os
+    hello_world_process = Process(name = 'hello_world', cmdline = 'echo hello world')
+    hello_world_task = Task(
+      resources = Resources(cpu = 0.1, ram = 16 * MB, disk = 16 * MB),
+      processes = [hello_world_process])
+    hello_world_job = Job(
+      cluster = 'cluster1',
+      role = os.getenv('USER'),
+      task = hello_world_task)
+    jobs = [hello_world_job]
+Then issue the following commands to create and kill the job, using your own values for the job key.
+    aurora job create cluster1/$USER/test/hello_world hello_world.aurora
+    aurora job kill cluster1/$USER/test/hello_world
+### Environment Tailoring
+Put the following in a file named `hello_world_productionized.aurora`, substituting your own values
+for values such as `cluster`s.
+    include('hello_world.aurora')
+    production_resources = Resources(cpu = 1.0, ram = 512 * MB, disk = 2 * GB)
+    staging_resources = Resources(cpu = 0.1, ram = 32 * MB, disk = 512 * MB)
+    hello_world_template = hello_world_job(
+        name = "hello_world-{{cluster}}",
+        task = hello_world_task(resources=production_resources))
+    jobs = [
+      # production jobs
+      hello_world_template(cluster = 'cluster1', instances = 25),
+      hello_world_template(cluster = 'cluster2', instances = 15),
+      # staging jobs
+      hello_world_template(
+        cluster = 'local',
+        instances = 1,
+        task = hello_world_task(resources=staging_resources)),
+    ]
+Then issue the following commands to create and kill the job, using your own values for the job key:
+    aurora job create cluster1/$USER/test/hello_world-cluster1 hello_world_productionized.aurora
+    aurora job kill cluster1/$USER/test/hello_world-cluster1
\ No newline at end of file

Added: aurora/site/source/documentation/0.18.0/reference/
--- aurora/site/source/documentation/0.18.0/reference/ (added)
+++ aurora/site/source/documentation/0.18.0/reference/ Wed Jun 21 06:36:21 2017
@@ -0,0 +1,614 @@
+Aurora Configuration Reference
+Don't know where to start? The Aurora configuration schema is very
+powerful, and configurations can become quite complex for advanced use cases.
+For examples of simple configurations to get something up and running
+quickly, check out the [Tutorial](../../getting-started/tutorial/). When you feel comfortable with the basics, move
+on to the [Configuration Tutorial](../configuration-tutorial/) for more in-depth coverage of
+configuration design.
+- [Process Schema](#process-schema)
+    - [Process Objects](#process-objects)
+- [Task Schema](#task-schema)
+    - [Task Object](#task-object)
+    - [Constraint Object](#constraint-object)
+    - [Resource Object](#resource-object)
+- [Job Schema](#job-schema)
+    - [Job Objects](#job-objects)
+    - [UpdateConfig Objects](#updateconfig-objects)
+    - [HealthCheckConfig Objects](#healthcheckconfig-objects)
+    - [Announcer Objects](#announcer-objects)
+    - [Container Objects](#container)
+    - [LifecycleConfig Objects](#lifecycleconfig-objects)
+- [Specifying Scheduling Constraints](#specifying-scheduling-constraints)
+- [Template Namespaces](#template-namespaces)
+    - [mesos Namespace](#mesos-namespace)
+    - [thermos Namespace](#thermos-namespace)
+Process Schema
+Process objects consist of required `name` and `cmdline` attributes. You can customize Process
+behavior with its optional attributes. Remember, Processes are handled by Thermos.
+### Process Objects
+  **Attribute Name**  | **Type**    | **Description**
+  ------------------- | :---------: | ---------------------------------
+   **name**           | String      | Process name (Required)
+   **cmdline**        | String      | Command line (Required)
+   **max_failures**   | Integer     | Maximum process failures (Default: 1)
+   **daemon**         | Boolean     | When True, this is a daemon process. (Default: False)
+   **ephemeral**      | Boolean     | When True, this is an ephemeral process. (Default: False)
+   **min_duration**   | Integer     | Minimum duration between process restarts in seconds. (Default: 15)
+   **final**          | Boolean     | When True, this process is a finalizing one that should run last. (Default: False)
+   **logger**         | Logger      | Struct defining the log behavior for the process. (Default: Empty)
+#### name
+The name is any valid UNIX filename string (specifically no
+slashes, NULLs or leading periods). Within a Task object, each Process name
+must be unique.
+#### cmdline
+The command line run by the process. The command line is invoked in a bash
+subshell, so it can involve full-blown bash scripts. However, nothing is
+supplied for command-line arguments, so `$*` is unspecified.
+#### max_failures
+The maximum number of failures (non-zero exit statuses) this process can
+have before being marked permanently failed and not retried. If a
+process permanently fails, Thermos looks at the failure limit of the task
+containing the process (usually 1) to determine if the task has
+failed as well.
+Setting `max_failures` to 0 makes the process retry
+indefinitely until it achieves a successful (zero) exit status.
+It retries at most once every `min_duration` seconds to prevent
+an effective denial of service attack on the coordinating Thermos scheduler.
+#### daemon
+By default, Thermos processes are non-daemon. If `daemon` is set to True, a
+successful (zero) exit status does not prevent future process runs.
+Instead, the process reinvokes after `min_duration` seconds.
+However, the maximum failure limit still applies. A combination of
+`daemon=True` and `max_failures=0` causes a process to retry
+indefinitely regardless of exit status. This should be avoided
+for very short-lived processes because of the accumulation of
+checkpointed state for each process run. When running in Mesos
+specifically, `max_failures` is capped at 100.
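The interaction of `max_failures` and `daemon` described above can be sketched in plain Python. This is an illustrative model only, not Thermos code: `should_rerun` and its arguments are hypothetical names, and the sketch ignores the `min_duration` pacing between runs and the Mesos cap on `max_failures`.

```python
def should_rerun(exit_status, failures_so_far, max_failures=1, daemon=False):
    """Return True if the process would be scheduled for another run.

    Hypothetical helper modelling the documented rules:
    - a non-zero exit status counts as a failure;
    - max_failures=0 means there is no failure limit;
    - a successful exit ends a non-daemon process, while a daemon runs again.
    """
    if exit_status != 0:
        failures = failures_so_far + 1
        # Permanently failed once the limit is reached (0 means unlimited).
        return max_failures == 0 or failures < max_failures
    # Successful (zero) exit: only daemon processes are reinvoked.
    return daemon
```

Under this model, `daemon=True` together with `max_failures=0` always returns True, which matches the "retry indefinitely regardless of exit status" case.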
+#### ephemeral
+By default, Thermos processes are non-ephemeral. If `ephemeral` is set to
+True, the process' status is not used to determine if its containing task
+has completed. For example, consider a task with a non-ephemeral
+webserver process and an ephemeral logsaver process
+that periodically checkpoints its log files to a centralized data store.
+The task is considered finished once the webserver process has
+completed, regardless of the logsaver's current status.
+#### min_duration
+Processes may succeed or fail multiple times during a single task's
+duration. Each of these is called a *process run*. `min_duration` is
+the minimum number of seconds the scheduler waits before running the
+same process.
+#### final
+Processes can be grouped into two classes: ordinary processes and
+finalizing processes. By default, Thermos processes are ordinary. They
+run as long as the task is considered healthy (i.e., no failure
+limits have been reached.) But once all regular Thermos processes
+finish or the task reaches a certain failure threshold, it
+moves into a "finalization" stage and runs all finalizing
+processes. These are typically processes necessary for cleaning up the
+task, such as log checkpointers, or perhaps e-mail notifications that
+the task completed.
+Finalizing processes may not depend upon ordinary processes or
+vice-versa; however, finalizing processes may depend upon other
+finalizing processes and otherwise run as a typical process.
+#### logger
+The default behavior of Thermos is to store stderr/stdout logs in files which grow unbounded.
+In the event that you have large log volume, you may want to configure Thermos to automatically
+rotate logs after they grow to a certain size, which can prevent your job from using more than its
+allocated disk space.
+Logger objects specify a `destination` for Process logs which is, by default, `file` - a pair of
+`stdout` and `stderr` files. It's also possible to specify `console` to get logs output to
+the Process stdout and stderr streams, `none` to suppress any logs output or `both` to send logs to
+files and console streams.
+The default Logger `mode` is `standard` which lets the stdout and stderr streams grow without bound.
+  **Attribute Name**  | **Type**          | **Description**
+  ------------------- | :---------------: | ---------------------------------
+   **destination**    | LoggerDestination | Destination of logs. (Default: `file`)
+   **mode**           | LoggerMode        | Mode of the logger. (Default: `standard`)
+   **rotate**         | RotatePolicy      | An optional rotation policy. (Default: `Empty`)
+A RotatePolicy describes log rotation behavior for when `mode` is set to `rotate` and it is ignored
+otherwise. If `rotate` is `Empty` or `RotatePolicy()` when the `mode` is set to `rotate` the
+defaults below are used.
+  **Attribute Name**  | **Type**     | **Description**
+  ------------------- | :----------: | ---------------------------------
+   **log_size**       | Integer      | Maximum size (in bytes) of an individual log file. (Default: 100 MiB)
+   **backups**        | Integer      | The maximum number of backups to retain. (Default: 5)
+An example process configuration is as follows:
+        process = Process(
+          name='process',
+          logger=Logger(
+            destination=LoggerDestination('both'),
+            mode=LoggerMode('rotate'),
+            rotate=RotatePolicy(log_size=5*MB, backups=5)
+          )
+        )
+Task Schema
+Tasks fundamentally consist of a `name` and a list of Process objects stored as the
+value of the `processes` attribute. Processes can be further constrained with
+`constraints`. By default, `name`'s value inherits from the first Process in the
+`processes` list, so for simple `Task` objects with one Process, `name`
+can be omitted. In Mesos, `resources` is also required.
+### Task Object
+   **param**               | **type**                         | **description**
+   ---------               | :---------:                      | ---------------
+   ```name```              | String                           | Task name. (Default: the name of the first Process in ```processes```)
+   ```processes```         | List of ```Process``` objects    | List of ```Process``` objects bound to this task. (Required)
+   ```constraints```       | List of ```Constraint``` objects | List of ```Constraint``` objects constraining processes.
+   ```resources```         | ```Resource``` object            | Resource footprint. (Required)
+   ```max_failures```      | Integer                          | Maximum process failures before being considered failed (Default: 1)
+   ```max_concurrency```   | Integer                          | Maximum number of concurrent processes (Default: 0, unlimited concurrency.)
+   ```finalization_wait``` | Integer                          | Amount of time allocated for finalizing processes, in seconds. (Default: 30)
+#### name
+`name` is a string denoting the name of this task. It defaults to the name of the first Process in
+the list of Processes associated with the `processes` attribute.
+#### processes
+`processes` is an unordered list of `Process` objects. To constrain the order
+in which they run, use `constraints`.
+#### constraints
+A list of `Constraint` objects. Currently it supports only one type,
+the `order` constraint. `order` is a list of process names
+that should run in the order given. For example,
+        process = Process(cmdline = "echo hello {{name}}")
+        task = Task(name = "echoes",
+                    processes = [process(name = "jim"), process(name = "bob")],
+                    constraints = [Constraint(order = ["jim", "bob"])])
+Constraints can be supplied ad-hoc and in duplicate. Not all
+Processes need be constrained, however Tasks with cycles are
+rejected by the Thermos scheduler.
+Use the `order` function as shorthand to generate `Constraint` lists.
+The following:
+        order(process1, process2)
+is shorthand for
+        [Constraint(order = [process1.name(), process2.name()])]
+The `order` function accepts Process name strings `('foo', 'bar')` or the processes
+themselves, e.g. `foo=Process(name='foo', ...)`, `bar=Process(name='bar', ...)`,
+`constraints=order(foo, bar)`.
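Since Tasks whose constraints form a cycle are rejected, it can help to check a set of `order` lists before submitting a configuration. The following is a minimal sketch using Kahn's topological sort; it is not the Thermos scheduler's own check, and `has_cycle` is a hypothetical helper that takes plain lists of process names rather than `Constraint` objects.

```python
from collections import defaultdict, deque


def has_cycle(order_constraints):
    """Return True if the given order lists cannot all be satisfied.

    order_constraints is a list of lists of process names, where each
    inner list means "run these processes in the order given".
    """
    successors = defaultdict(set)
    indegree = defaultdict(int)
    nodes = set()
    for order in order_constraints:
        nodes.update(order)
        # Each adjacent pair contributes one "before -> after" edge.
        for before, after in zip(order, order[1:]):
            if after not in successors[before]:
                successors[before].add(after)
                indegree[after] += 1

    # Repeatedly remove processes that have nothing left to wait for.
    ready = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while ready:
        node = ready.popleft()
        visited += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Any process never removed is part of (or blocked by) a cycle.
    return visited != len(nodes)
```

Duplicate constraints are harmless in this model because each edge is recorded only once.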
+#### resources
+Takes a `Resource` object, which specifies the amounts of CPU, memory, and disk space resources
+to allocate to the Task.
+#### max_failures
+`max_failures` is the number of failed processes needed for the `Task` to be
+marked as failed.
+For example, assume a Task has two Processes and a `max_failures` value of `2`:
+        template = Process(max_failures=10)
+        task = Task(
+          name = "fail",
+          processes = [
+             template(name = "failing", cmdline = "exit 1"),
+             template(name = "succeeding", cmdline = "exit 0")
+          ],
+          max_failures=2)
+The `failing` Process could fail 10 times before being marked as permanently
+failed, and the `succeeding` Process could succeed on the first run. However,
+the task would succeed despite only allowing for two failed processes. To be more
+specific, there would be 10 failed process runs yet 1 failed process. Both processes
+would have to fail for the Task to fail.
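The distinction between failed process runs and failed processes can be made concrete with a small model. This is a sketch under stated assumptions, not Thermos code: `task_failed` is a hypothetical function, all processes share one process-level `max_failures`, and the task-level `max_failures` is at least 1.

```python
def task_failed(failed_runs, process_max_failures, task_max_failures):
    """Return True if enough processes have permanently failed.

    failed_runs maps a process name to its number of failed runs.
    A process is permanently failed once its failed runs reach
    process_max_failures (0 means the process never permanently fails).
    """
    permanently_failed = [
        name
        for name, runs in failed_runs.items()
        if process_max_failures != 0 and runs >= process_max_failures
    ]
    # The task fails only when the count of failed *processes*,
    # not failed runs, reaches the task-level limit.
    return len(permanently_failed) >= task_max_failures
```

With the example above, `task_failed({"failing": 10, "succeeding": 0}, 10, 2)` is False: ten failed runs still amount to a single failed process.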
+#### max_concurrency
+For Tasks with a number of expensive but otherwise independent
+processes, you may want to limit the amount of concurrency
+the Thermos scheduler provides rather than artificially constraining
+it via `order` constraints. For example, a test framework may
+generate a task with 100 test run processes, but wants to run it on
+a machine with only 4 cores. You can limit the amount of parallelism to
+4 by setting `max_concurrency=4` in your task configuration.
+For example, the following task spawns 180 Processes ("mappers")
+to compute individual elements of a 180 degree sine table, followed by
+one final Process ("reducer") that tabulates the results once every mapper has finished:
+    def make_mapper(id):
+      # Each mapper writes one sine value to its own temporary file.
+      return Process(
+        name = "mapper%03d" % id,
+        cmdline = "echo 'scale=50;s(%d*4*a(1)/180)' | bc -l > "
+                  "temp.sine_table.%03d" % (id, id))
+    def make_reducer():
+      # The reducer concatenates the temporary files, then removes them.
+      return Process(name = "reducer",
+                     cmdline = "cat temp.* | nl > sine_table.txt && rm -f temp.*")
+    processes = [make_mapper(id) for id in range(180)]
+    task = Task(
+      name = "mapreduce",
+      processes = processes + [make_reducer()],
+      # One constraint per mapper: each mapper must finish before the reducer starts.
+      constraints = [Constraint(order = [mapper.name(), 'reducer']) for mapper
+                     in processes],
+      max_concurrency = 8)
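As a rough feel for what `max_concurrency` implies, the arithmetic below gives a lower bound on the number of "waves" needed to run a set of independent processes. It is a back-of-the-envelope sketch, not a description of how the Thermos scheduler batches work; `min_waves` is a hypothetical helper, and real processes finish at different times.

```python
def min_waves(num_processes, max_concurrency):
    """Lower bound on sequential waves needed to run independent processes.

    max_concurrency=0 means unlimited concurrency, so one wave suffices.
    """
    if max_concurrency == 0:
        return 1
    # Ceiling division without floating point.
    return -(-num_processes // max_concurrency)
```

For 180 mappers at `max_concurrency = 8` this gives 23 waves, and for 100 test processes limited to 4 at a time it gives 25.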
+#### finalization_wait
+Process execution is organized into three active stages: `ACTIVE`,
+`CLEANING`, and `FINALIZING`. The `ACTIVE` stage is when ordinary processes run.
+This stage lasts as long as Processes are running and the Task is healthy.
+The moment either all Processes have finished successfully or the Task has reached a
+maximum Process failure limit, it goes into `CLEANING` stage and sends
+SIGTERMs to all currently running Processes and their process trees.
+Once all Processes have terminated, the Task goes into `FINALIZING` stage
+and invokes the schedule of all Processes with the "final" attribute set to True.
+This whole process from the end of `ACTIVE` stage to the end of `FINALIZING`
+must happen within `finalization_wait` seconds. If it does not
+finish during that time, all remaining Processes are sent SIGKILLs
+(or if they depend upon uncompleted Processes, are
+never invoked.)
+When running on Aurora, the `finalization_wait` is capped at 60 seconds.
+### Constraint Object
+Current constraint objects only support a single ordering constraint, `order`,
+which specifies its processes run sequentially in the order given. By
+default, all processes run in parallel when bound to a `Task` without
+ordering constraints.
+   param | type           | description
+   ----- | :----:         | -----------
+   order | List of String | List of processes by name (String) that should be run serially.
+### Resource Object
+Specifies the amount of CPU, RAM, and disk resources the task needs. See the
+[Resource Isolation document](../../features/resource-isolation/) for suggested values and to understand how
+resources are allocated.
+  param      | type    | description
+  -----      | :----:  | -----------
+  ```cpu```  | Float   | Fractional number of cores required by the task.
+  ```ram```  | Integer | Bytes of RAM required by the task.
+  ```disk``` | Integer | Bytes of disk required by the task.
+  ```gpu```  | Integer | Number of GPU cores required by the task
+Job Schema
+### Job Objects
+*Note: Specifying a ```Container``` object as the value of the ```container``` property is
+  deprecated in favor of setting its value directly to the appropriate ```Docker``` or ```Mesos```
+  container type*
+*Note: Specifying preemption behavior of tasks through `production` flag is deprecated in favor of
+  electing appropriate task tier via `tier` attribute.*
+   name | type | description
+   ------ | :-------: | -------
+  ```task``` | Task | The Task object to bind to this job. Required.
+  ```name``` | String | Job name. (Default: inherited from the task attribute's name)
+  ```role``` | String | Job role account. Required.
+  ```cluster``` | String | Cluster in which this job is scheduled. Required.
+  ```environment``` | String | Job environment, default ```devel```. Must be one of ```prod```, ```devel```, ```test``` or ```staging<number>```.
+  ```contact``` | String | Best email address to reach the owner of the job. For production jobs, this is usually a team mailing list.
+  ```instances```| Integer | Number of instances (sometimes referred to as replicas or shards) of the task to create. (Default: 1)
+  ```cron_schedule``` | String | Cron schedule in cron format. May only be used with non-service jobs. See [Cron Jobs](../../features/cron-jobs/) for more information. Default: None (not a cron job.)
+  ```cron_collision_policy``` | String | Policy to use when a cron job is triggered while a previous run is still active. ```KILL_EXISTING```: kill the previous run and schedule the new run. ```CANCEL_NEW```: let the previous run continue and cancel the new run. (Default: KILL_EXISTING)
+  ```update_config``` | ```UpdateConfig``` object | Parameters for controlling the rate and policy of rolling updates.
+  ```constraints``` | dict | Scheduling constraints for the tasks. See the section on the [constraint specification language](#specifying-scheduling-constraints)
+  ```service``` | Boolean | If True, restart tasks regardless of success or failure. (Default: False)
+  ```max_task_failures``` | Integer | Maximum number of failures after which the task is considered to have failed (Default: 1) Set to -1 to allow for infinite failures
+  ```priority``` | Integer | Preemption priority to give the task (Default 0). Tasks with higher priorities may preempt tasks at lower priorities.
+  ```production``` | Boolean |  (Deprecated) Whether or not this is a production task that may [preempt](../../features/multitenancy/#preemption) other tasks (Default: False). Production job role must have the appropriate [quota](../../features/multitenancy/#preemption).
+  ```health_check_config``` | ```HealthCheckConfig``` object | Parameters for controlling a task's health checks. HTTP health check is only used if a  health port was assigned with a command line wildcard.
+  ```container``` | Choice of ```Container```, ```Docker``` or ```Mesos``` object | An optional container to run all processes inside of.
+  ```lifecycle``` | ```LifecycleConfig``` object | An optional task lifecycle configuration that dictates commands to be executed on startup/teardown.  HTTP lifecycle is enabled by default if the "health" port is requested.  See [LifecycleConfig Objects](#lifecycleconfig-objects) for more information.
+  ```tier``` | String | Task tier type. The default scheduler tier configuration allows for 3 tiers: `revocable`, `preemptible`, and `preferred`. If a tier is not elected, Aurora assigns the task to a tier based on its choice of `production` (that is `preferred` for production and `preemptible` for non-production jobs). See the section on [Configuration Tiers](../../features/multitenancy/#configuration-tiers) for more information.
+  ```announce``` | ```Announcer``` object | Optionally enable Zookeeper ServerSet announcements. See [Announcer Objects](#announcer-objects) for more information.
+  ```enable_hooks``` | Boolean | Whether to enable [Client Hooks](../client-hooks/) for this job. (Default: False)
+### UpdateConfig Objects
+Parameters for controlling the rate and policy of rolling updates.
+| object                       | type     | description
+| ---------------------------- | :------: | ------------
+| ```batch_size```             | Integer  | Maximum number of shards to be updated in one iteration (Default: 1)
+| ```watch_secs```             | Integer  | Minimum number of seconds a shard must remain in ```RUNNING``` state before being considered a success (Default: 45)
+| ```max_per_shard_failures``` | Integer  | Maximum number of restarts per shard during update. Increments total failure count when this limit is exceeded. (Default: 0)
+| ```max_total_failures```     | Integer  | Maximum number of shard failures to be tolerated in total during an update. Cannot be greater than or equal to the total number of tasks in a job. (Default: 0)
+| ```rollback_on_failure```    | boolean  | When False, prevents auto rollback of a failed update (Default: True)
+| ```wait_for_batch_completion```| boolean | When True, all threads from a given batch will be blocked from picking up new instances until the entire batch is updated. This essentially simulates the legacy sequential updater algorithm. (Default: False)
+| ```pulse_interval_secs```    | Integer  |  Indicates a [coordinated update](../../features/job-updates/#coordinated-job-updates). If no pulses are received within the provided interval the update will be blocked. Beta-updater only. Will fail on submission when used with client updater. (Default: None)
+### HealthCheckConfig Objects
+Parameters for controlling a task's health checks via HTTP or a shell command.
+| param                          | type      | description
+| -------                        | :-------: | --------
+| ```health_checker```           | HealthCheckerConfig | Configure what kind of health check to use.
+| ```initial_interval_secs```    | Integer   | Initial grace period (during which health-check failures are ignored) while performing health checks. (Default: 15)
+| ```interval_secs```            | Integer   | Interval on which to check the task's health. (Default: 10)
+| ```max_consecutive_failures``` | Integer   | Maximum number of consecutive failures that will be tolerated before considering a task unhealthy (Default: 0)
+| ```min_consecutive_successes``` | Integer   | Minimum number of consecutive successful health checks required before considering a task healthy (Default: 1)
+| ```timeout_secs```             | Integer   | Health check timeout. (Default: 1)
+### HealthCheckerConfig Objects
+| param                          | type                | description
+| -------                        | :-------:           | --------
+| ```http```                     | HttpHealthChecker  | Configure health check to use HTTP. (Default)
+| ```shell```                    | ShellHealthChecker | Configure health check via a shell command.
+### HttpHealthChecker Objects
+| param                          | type      | description
+| -------                        | :-------: | --------
+| ```endpoint```                 | String    | HTTP endpoint to check (Default: /health)
+| ```expected_response```        | String    | If not empty, fail the HTTP health check if the response differs. Case insensitive. (Default: ok)
+| ```expected_response_code```   | Integer   | If not zero, fail the HTTP health check if the response code differs. (Default: 0)
+### ShellHealthChecker Objects
+| param                          | type      | description
+| -------                        | :-------: | --------
+| ```shell_command```            | String    | An alternative to HTTP health checking. Specifies a shell command that will be executed. Any non-zero exit status will be interpreted as a health check failure.
+### Announcer Objects
+If the `announce` field in the Job configuration is set, each task will be
+registered in the ServerSet `/aurora/role/environment/jobname` in the
+zookeeper ensemble configured by the executor (which can be optionally overridden by specifying
+the `zk_path` parameter).  If no Announcer object is specified,
+no announcement will take place.  For more information about ServerSets, see the [Service Discovery](../../features/service-discovery/) documentation.
+By default, the hostname in the registered endpoints will be the `--hostname` parameter
+that is passed to the mesos agent. To override the hostname value, the executor can be started
+with `--announcer-hostname=<overriden_value>`. If you decide to use `--announcer-hostname` and if
+the overridden value needs to change for every executor, then the executor has to be started inside a wrapper; see [Executor Wrapper](../../operations/configuration/#thermos-executor-wrapper).
+For example, if you want the hostname in the endpoint to be an IP address instead of the hostname,
+the `--hostname` parameter to the mesos agent can be set to the machine IP or the executor can
+be started with `--announcer-hostname=<host_ip>` while wrapping the executor inside a script.
+| object                         | type      | description
+| -------                        | :-------: | --------
+| ```primary_port```             | String    | Which named port to register as the primary endpoint in the ServerSet (Default: `http`)
+| ```portmap```                  | dict      | A mapping of additional endpoints to be announced in the ServerSet (Default: `{ 'aurora': '{{primary_port}}' }`)
+| ```zk_path```                  | String    | Zookeeper serverset path override (executor must be started with the `--announcer-allow-custom-serverset-path` parameter)
+#### Port aliasing with the Announcer `portmap`
+The primary endpoint registered in the ServerSet is the one allocated to the port
+specified by the `primary_port` in the `Announcer` object, by default
+the `http` port.  This port can be referenced from anywhere within a configuration
+as `{{thermos.ports[http]}}`.
+Without the port map, each named port would be allocated a unique port number.
+The `portmap` allows two different named ports to be aliased together.  The default
+`portmap` aliases the `aurora` port (i.e. `{{thermos.ports[aurora]}}`) to
+the `http` port.  Even though the two ports can be referenced independently,
+only one port is allocated by Mesos.  Any port referenced in a `Process` object
+but which is not in the portmap will be allocated dynamically by Mesos and announced as well.
+It is possible to use the portmap to alias names to static port numbers, e.g.
+`{'http': 80, 'https': 443, 'aurora': 'http'}`.  In this case, referencing
+`{{thermos.ports[aurora]}}` would look up `{{thermos.ports[http]}}` then
+find a static port 80.  No port would be requested of or allocated by Mesos.
+Static ports should be used cautiously as Aurora does nothing to prevent two
+tasks with the same static port allocations from being co-scheduled.
+External constraints such as agent attributes should be used to enforce such
+guarantees should they be needed.
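The alias lookup described above can be modelled in a few lines. This is a sketch of the described behaviour, not Aurora's implementation: `resolve_port` is a hypothetical helper, and the `allocated` dict stands in for the ports that Mesos would assign dynamically.

```python
def resolve_port(name, portmap, allocated):
    """Follow portmap aliases until a concrete port number is found.

    portmap maps a port name to either another port name (an alias)
    or a static integer. allocated maps port names to the numbers
    assumed to have been assigned dynamically.
    """
    seen = set()
    while name in portmap:
        if name in seen:
            raise ValueError("portmap alias cycle at %r" % name)
        seen.add(name)
        target = portmap[name]
        if isinstance(target, int):
            # Static port: nothing is requested from the allocator.
            return target
        name = target
    # Not aliased any further: use the dynamically allocated port.
    return allocated[name]
```

With `{'http': 80, 'https': 443, 'aurora': 'http'}`, resolving `aurora` follows the alias to `http` and returns the static port 80 without consulting `allocated`.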
+### Container Objects
+Describes the container the job's processes will run inside. If not using Docker or the Mesos
+unified-container, the container can be omitted from your job config.
+  param          | type           | description
+  -----          | :----:         | -----------
+  ```mesos```    | Mesos          | A native Mesos container to use.
+  ```docker```   | Docker         | A Docker container to use (via Docker engine)
+### Mesos Object
+  param            | type                           | description
+  -----            | :----:                         | -----------
+  ```image```      | Choice(AppcImage, DockerImage) | An optional filesystem image to use within this container.
+  ```volumes```    | List(Volume)                   | An optional list of volume mounts for this container.
+### Volume Object
+  param                  | type     | description
+  -----                  | :----:   | -----------
+  ```container_path```   | String   | Mount point in the container.
+  ```volume_path```      | String   | Path on the host to mount.
+  ```mode```             | Enum     | Mode of the mount, can be 'RW' or 'RO'.
+### AppcImage
+Describes an AppC filesystem image.
+  param          | type   | description
+  -----          | :----: | -----------
+  ```name```     | String | The name of the appc image.
+  ```image_id``` | String | The image id of the appc image.
+### DockerImage
+Describes a Docker filesystem image.
+  param      | type   | description
+  -----      | :----: | -----------
+  ```name``` | String | The name of the docker image.
+  ```tag```  | String | The tag that identifies the docker image.
+### Docker Object
+*Note: In order to correctly execute processes inside a job, the Docker container must have Python 2.7 installed.*
+*Note: For private docker registry, mesos mandates the docker credential file to be named as `.dockercfg`, even though docker may create a credential file with a different name on various platforms. Also, the `.dockercfg` file needs to be copied into the sandbox using the `-thermos_executor_resources` flag, specified while starting Aurora.*
+  param            | type            | description
+  -----            | :----:          | -----------
+  ```image```      | String          | The name of the docker image to execute.  If the image does not exist locally it will be pulled with ```docker pull```.
+  ```parameters``` | List(Parameter) | Additional parameters to pass to the Docker engine.
+### Docker Parameter Object
+Docker CLI parameters. This needs to be enabled by the scheduler `-allow_docker_parameters` option.
+See the Docker Command Line Reference for valid parameters.
+  param            | type            | description
+  -----            | :----:          | -----------
+  ```name```       | String          | The name of the docker parameter. E.g. volume
+  ```value```      | String          | The value of the parameter. E.g. /usr/local/bin:/usr/bin:rw
+### LifecycleConfig Objects
+*Note: The only lifecycle configuration supported is the HTTP lifecycle via the HttpLifecycleConfig.*
+  param          | type                | description
+  -----          | :----:              | -----------
+  ```http```     | HttpLifecycleConfig | Configure the lifecycle manager to send lifecycle commands to the task via HTTP.
+### HttpLifecycleConfig Objects
+  param          | type            | description
+  -----          | :----:          | -----------
+  ```port```     | String          | The named port to send POST commands (Default: health)
+  ```graceful_shutdown_endpoint``` | String | Endpoint to hit to indicate that a task should gracefully shutdown. (Default: /quitquitquit)
+  ```shutdown_endpoint``` | String | Endpoint to hit to give a task its final warning before being killed. (Default: /abortabortabort)
+#### graceful_shutdown_endpoint
+If the Job is listening on the port as specified by the HttpLifecycleConfig
+(default: `health`), an HTTP POST request will be sent over localhost to this
+endpoint to request that the task gracefully shut itself down.  This is a
+courtesy call before the `shutdown_endpoint` is invoked a fixed amount of
+time later.
+#### shutdown_endpoint
+If the Job is listening on the port as specified by the HttpLifecycleConfig
+(default: `health`), an HTTP POST request will be sent over localhost to this
+endpoint as a final warning before the task is shut down.  If the task
+does not shut down on its own after this, it will be forcefully killed.
+Specifying Scheduling Constraints
+In the `Job` object there is a map `constraints` from String to String
+allowing the user to tailor the schedulability of tasks within the job.
+The constraint map's key is the attribute name on which we
+constrain Tasks within our Job. The value is how we constrain them.
+There are two types of constraints: *limit constraints* and *value constraints*.
+| constraint    | description
+| ------------- | --------------
+| Limit         | A string that specifies a limit for a constraint. Starts with <code>'limit:</code> followed by an Integer and closing single quote, such as ```'limit:1'```.
+| Value         | A string that specifies a value for a constraint. To include a list of values, separate the values using commas. To negate the values of a constraint, start with a ```!```.
+Further details can be found in the [Scheduling Constraints](../../features/constraints/) feature description.
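To make the two constraint forms concrete, here is a small parser for the string values in a `constraints` map. It is an illustrative sketch, not the Aurora scheduler's parser: `parse_constraint` and the tuple shapes it returns are hypothetical.

```python
def parse_constraint(value):
    """Classify a constraint string as a limit or a value constraint.

    Returns ("limit", n) for strings such as "limit:1", or
    ("value", negated, values) for plain or comma-separated values,
    where a leading "!" sets negated to True.
    """
    if value.startswith("limit:"):
        return ("limit", int(value[len("limit:"):]))
    negated = value.startswith("!")
    if negated:
        value = value[1:]
    return ("value", negated, value.split(","))
```

For instance, `parse_constraint("limit:1")` yields a limit of 1, while `parse_constraint("!a,b")` yields a negated value constraint over `a` and `b`.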
+Template Namespaces
+Currently, a few Pystachio namespaces have special semantics. Using them
+in your configuration allows you to tailor application behavior
+through environment introspection or interact in special ways with the
+Aurora client or Aurora-provided services.
+### mesos Namespace
+The `mesos` namespace contains variables which relate to the `mesos` agent
+which launched the task. The `instance` variable can be used
+to distinguish between Task replicas.
+| variable name     | type       | description
+| --------------- | :--------: | -------------
+| ```instance```    | Integer    | The instance number of the created task. A job with 5 replicas has instance numbers 0, 1, 2, 3, and 4.
+| ```hostname``` | String | The instance hostname that the task was launched on.
+Please note, there is no uniqueness guarantee for `instance` in the presence of
+network partitions. If that is required, it should be baked in at the application
+level using a distributed coordination service such as Zookeeper.
+### thermos Namespace
+The `thermos` namespace contains variables that work directly on the
+Thermos platform in addition to Aurora. This namespace is fully
+compatible with Tasks invoked via the `thermos` CLI.
+| variable      | type                     | description                        |
+| :----------:  | ---------                | ------------                       |
+| ```ports```   | map of string to Integer | A map of names to port numbers     |
+| ```task_id``` | string                   | The task ID assigned to this task. |
+The `thermos.ports` namespace is automatically populated by Aurora when
+invoking tasks on Mesos. When running the `thermos` command directly,
+these ports must be explicitly mapped with the `-P` option.
+For example, if `{{thermos.ports[http]}}` is specified in a `Process`
+configuration, Aurora extracts and populates it automatically. When
+running via the CLI, it must be mapped explicitly, for example with
+`thermos -P http:12345` to map `http` to port 12345.
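
The port and instance bindings described above can be illustrated with a toy resolver. This is a minimal sketch using only the Python standard library. It is not Pystachio's implementation, and the `resolve` helper, the sample command line, and the port numbers are assumptions made up for illustration.

```python
import re

# Matches {{namespace.variable}} or {{namespace.variable[key]}}.
REF = re.compile(r"\{\{\s*(\w+)\.(\w+)(?:\[(\w+)\])?\s*\}\}")


def resolve(template, bindings):
    """Substitute mustache references found in `template`.

    `bindings` maps namespace -> variable -> value (or a dict for
    indexed variables such as ports). References that cannot be
    resolved are left untouched.
    """
    def substitute(match):
        namespace, variable, key = match.groups()
        try:
            value = bindings[namespace][variable]
            if key is not None:
                value = value[key]
        except (KeyError, TypeError):
            return match.group(0)
        return str(value)

    return REF.sub(substitute, template)


# Hypothetical command line for a Process.
cmdline = "./server --port={{thermos.ports[http]}} --shard={{mesos.instance}}"

# Roughly what `-P http:12345` supplies when running via the thermos CLI:
# only the port is bound, so the mesos reference stays unresolved.
cli_bindings = {"thermos": {"ports": {"http": 12345}}}
print(resolve(cmdline, cli_bindings))

# Under Aurora both namespaces are populated automatically
# (the values here are invented).
aurora_bindings = {
    "thermos": {"ports": {"http": 31002}},
    "mesos": {"instance": 3},
}
print(resolve(cmdline, aurora_bindings))
```

The first call leaves `{{mesos.instance}}` in place because no `mesos` namespace is bound outside of Mesos. The second call resolves both references.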

Added: aurora/site/source/documentation/0.18.0/reference/
--- aurora/site/source/documentation/0.18.0/reference/ (added)
+++ aurora/site/source/documentation/0.18.0/reference/ Wed Jun 21 06:36:21 2017
@@ -0,0 +1,89 @@
+# Observer Configuration Reference
+The Aurora/Thermos observer can take a variety of configuration options through command-line arguments.
+A list of the available options can be seen by running `thermos_observer --long-help`.
+Please refer to the [Operator Configuration Guide](../../operations/configuration/) for details on how
+to properly set the most important options.
+$ thermos_observer.pex --long-help
+  -h, --help, --short-help
+                        show this help message and exit.
+  --long-help           show options from all registered modules, not just the
+                        __main__ module.
+  --mesos-root=MESOS_ROOT
+                        The mesos root directory to search for Thermos
+                        executor sandboxes [default: /var/lib/mesos]
+  --ip=IP               The IP address the observer will bind to. [default:
+              ]
+  --port=PORT           The port on which the observer should listen.
+                        [default: 1338]
+  --polling_interval_secs=POLLING_INTERVAL_SECS
+                        The number of seconds between observer refresh
+                        attempts. [default: 5]
+  --task_process_collection_interval_secs=TASK_PROCESS_COLLECTION_INTERVAL_SECS
+                        The number of seconds between per task process
+                        resource collections. [default: 20]
+  --task_disk_collection_interval_secs=TASK_DISK_COLLECTION_INTERVAL_SECS
+                        The number of seconds between per task disk resource
+                        collections. [default: 60]
+  From module
+    --app_daemonize     Daemonize this application. [default: False]
+    --app_profile_output=FILENAME
+                        Dump the profiling output to a binary profiling
+                        format. [default: None]
+    --app_daemon_stderr=TWITTER_COMMON_APP_DAEMON_STDERR
+                        Direct this app's stderr to this file if daemonized.
+                        [default: /dev/null]
+    --app_debug         Print extra debugging information during application
+                        initialization. [default: False]
+    --app_rc_filename   Print the filename for the rc file and quit. [default:
+                        False]
+    --app_daemon_stdout=TWITTER_COMMON_APP_DAEMON_STDOUT
+                        Direct this app's stdout to this file if daemonized.
+                        [default: /dev/null]
+    --app_profiling     Run profiler on the code while it runs.  Note this can
+                        cause slowdowns. [default: False]
+    --app_ignore_rc_file
+                        Ignore default arguments from the rc file. [default:
+                        False]
+                        The pidfile to use if --app_daemonize is specified.
+                        [default: None]
+  From module twitter.common.log.options:
+    --log_to_stdout=[scheme:]LEVEL
+                        OBSOLETE - legacy flag, use --log_to_stderr instead.
+                        [default: ERROR]
+    --log_to_stderr=[scheme:]LEVEL
+                        The level at which logging to stderr [default: ERROR].
+                        Takes either LEVEL or scheme:LEVEL, where LEVEL is one
+                        of ['INFO', 'NONE', 'WARN', 'ERROR', 'DEBUG', 'FATAL']
+                        and scheme is one of ['google', 'plain'].
+    --log_to_disk=[scheme:]LEVEL
+                        The level at which logging to disk [default: INFO].
+                        Takes either LEVEL or scheme:LEVEL, where LEVEL is one
+                        of ['INFO', 'NONE', 'WARN', 'ERROR', 'DEBUG', 'FATAL']
+                        and scheme is one of ['google', 'plain'].
+    --log_dir=DIR       The directory into which log files will be generated
+                        [default: /var/tmp].
+    --log_simple        Write a single log file rather than one log file per
+                        log level [default: False].
+    --log_to_scribe=[scheme:]LEVEL
+                        The level at which logging to scribe [default: NONE].
+                        Takes either LEVEL or scheme:LEVEL, where LEVEL is one
+                        of ['INFO', 'NONE', 'WARN', 'ERROR', 'DEBUG', 'FATAL']
+                        and scheme is one of ['google', 'plain'].
+    --scribe_category=CATEGORY
+                        The category used when logging to the scribe daemon.
+                        [default: python_default].
+    --scribe_buffer     Buffer messages when scribe is unavailable rather than
+                        dropping them. [default: False].
+    --scribe_host=HOST  The host running the scribe daemon. [default:
+                        localhost].
+    --scribe_port=PORT  The port used to connect to the scribe daemon.
+                        [default: 1463].
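
The `--name=VALUE` form and the `[scheme:]LEVEL` logging syntax shown in the help output can be assembled programmatically. The following is a minimal sketch: the helper names `log_spec` and `observer_argv` and the chosen option values (including the log directory) are assumptions for illustration, and only flags listed in the help output above are used.

```python
# Levels and schemes as listed in the --long-help output.
LEVELS = {"INFO", "NONE", "WARN", "ERROR", "DEBUG", "FATAL"}
SCHEMES = {"google", "plain"}


def log_spec(level, scheme=None):
    """Build a LEVEL or scheme:LEVEL value for the --log_to_* flags."""
    if level not in LEVELS:
        raise ValueError("unknown log level: {!r}".format(level))
    if scheme is None:
        return level
    if scheme not in SCHEMES:
        raise ValueError("unknown log scheme: {!r}".format(scheme))
    return "{}:{}".format(scheme, level)


def observer_argv(options, binary="thermos_observer.pex"):
    """Render a dict of flag names to values as --name=value arguments."""
    argv = [binary]
    for name, value in options.items():
        argv.append("--{}={}".format(name, value))
    return argv


# Hypothetical settings; adjust to your environment.
options = {
    "mesos-root": "/var/lib/mesos",
    "port": 1338,
    "polling_interval_secs": 5,
    "log_to_disk": log_spec("DEBUG", scheme="plain"),
    "log_dir": "/var/log/thermos",
}
argv = observer_argv(options)
print(" ".join(argv))
```

Building the argument list as a Python list (rather than a single string) lets it be passed directly to `subprocess.run` without shell quoting.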

Added: aurora/site/source/documentation/0.18.0/reference/
--- aurora/site/source/documentation/0.18.0/reference/ (added)
+++ aurora/site/source/documentation/0.18.0/reference/ Wed Jun 21 06:36:21 2017
@@ -0,0 +1,268 @@
+# Scheduler Configuration Reference
+The Aurora scheduler can take a variety of configuration options through command-line arguments.
+A list of the available options can be seen by running `aurora-scheduler -help`.
+Please refer to the [Operator Configuration Guide](../../operations/configuration/) for details on how
+to properly set the most important options.
+$ aurora-scheduler -help
+-h or -help to print this help message
+Required flags:
+-backup_dir [not null]
+	Directory to store backups under. Will be created if it does not exist.
+-cluster_name [not null]
+	Name to identify the cluster being served.
+-db_max_active_connection_count [must be > 0]
+	Max number of connections to use with database via MyBatis
+-db_max_idle_connection_count [must be > 0]
+	Max number of idle connections to the database via MyBatis
+-framework_authentication_file
+	Properties file which contains framework credentials to authenticate with the Mesos master. Must contain the properties 'aurora_authentication_principal' and 'aurora_authentication_secret'.
+	The ip address to listen. If not set, the scheduler will listen on all interfaces.
+-mesos_master_address [not null]
+	Address for the mesos master, can be a socket address or zookeeper path.
+	The Mesos role this framework will register as. The default is to leave this empty, and the framework will register without any role and only receive unreserved resources in offers.
+-serverset_path [not null, must be non-empty]
+	ZooKeeper ServerSet path to register at.
+	Fully qualified class name of the servlet filter to be applied after the shiro auth filters are applied.
+-thermos_executor_path
+	Path to the thermos executor entry point.
+-tier_config [file must be readable]
+	Configuration file defining supported task tiers, task traits and behaviors.
+-webhook_config [file must exist, file must be readable]
+	Path to webhook configuration file.
+-zk_endpoints [must have at least 1 item]
+	Endpoint specification for the ZooKeeper servers.
+Optional flags:
+-allow_container_volumes (default false)
+	Allow passing in volumes in the job. Enabling this could pose a privilege escalation threat.
+-allow_docker_parameters (default false)
+	Allow to pass docker container parameters in the job.
+-allow_gpu_resource (default false)
+	Allow jobs to request Mesos GPU resource.
+-allowed_container_types (default [MESOS])
+	Container types that are allowed to be used by jobs.
+-async_slot_stat_update_interval (default (1, mins))
+	Interval on which to try to update open slot stats.
+-async_task_stat_update_interval (default (1, hrs))
+	Interval on which to try to update resource consumption stats.
+-async_worker_threads (default 8)
+	The number of worker threads to process async task operations with.
+-backup_interval (default (1, hrs))
+	Minimum interval on which to write a storage backup.
+-cron_scheduler_num_threads (default 10)
+	Number of threads to use for the cron scheduler thread pool.
+-cron_scheduling_max_batch_size (default 10) [must be > 0]
+	The maximum number of triggered cron jobs that can be processed in a batch.
+-cron_start_initial_backoff (default (5, secs))
+	Initial backoff delay while waiting for a previous cron run to be killed.
+-cron_start_max_backoff (default (1, mins))
+	Max backoff delay while waiting for a previous cron run to be killed.
+-cron_timezone (default GMT)
+	TimeZone to use for cron predictions.
+-custom_executor_config [file must exist, file must be readable]
+	Path to custom executor settings configuration file.
+-db_lock_timeout (default (1, mins))
+	H2 table lock timeout
+-db_row_gc_interval (default (2, hrs))
+	Interval on which to scan the database for unused row references.
+-default_docker_parameters (default {})
+	Default docker parameters for any job that does not explicitly declare parameters.
+-dlog_max_entry_size (default (512, KB))
+	Specifies the maximum entry size to append to the log. Larger entries will be split across entry Frames.
+-dlog_shutdown_grace_period (default (2, secs))
+	Specifies the maximum time to wait for scheduled checkpoint and snapshot actions to complete before forcibly shutting down.
+-dlog_snapshot_interval (default (1, hrs))
+	Specifies the frequency at which snapshots of local storage are taken and written to the log.
+	List of domains for which CORS support should be enabled.
+-enable_db_metrics (default true)
+	Whether to use MyBatis interceptor to measure the timing of intercepted Statements.
+-enable_h2_console (default false)
+	Enable H2 DB management console.
+-enable_mesos_fetcher (default false)
+	Allow jobs to pass URIs to the Mesos Fetcher. Note that enabling this feature could pose a privilege escalation threat.
+-enable_preemptor (default true)
+	Enable the preemptor and preemption
+-enable_revocable_cpus (default true)
+	Treat CPUs as a revocable resource.
+-enable_revocable_ram (default false)
+	Treat RAM as a revocable resource.
+-executor_user (default root)
+	User to start the executor. Defaults to "root". Set this to an unprivileged user if the mesos master was started with "--no-root_submissions". If set to anything other than "root", the executor will ignore the "role" setting for jobs since it can't use setuid() anymore. This means that all your jobs will run under the specified user and the user has to exist on the Mesos agents.
+-first_schedule_delay (default (1, ms))
+	Initial amount of time to wait before first attempting to schedule a PENDING task.
+-flapping_task_threshold (default (5, mins))
+	A task that repeatedly runs for less than this time is considered to be flapping.
+-framework_announce_principal (default false)
+	When 'framework_authentication_file' flag is set, the FrameworkInfo registered with the mesos master will also contain the principal. This is necessary if you intend to use mesos authorization via mesos ACLs. The default will change in a future release. Changing this value is backwards incompatible. For details, see MESOS-703.
+-framework_failover_timeout (default (21, days))
+	Time after which a framework is considered deleted.  SHOULD BE VERY HIGH.
+-framework_name (default Aurora)
+	Name used to register the Aurora framework with Mesos.
+-global_container_mounts (default [])
+	A comma separated list of mount points (in host:container form) to mount into all (non-mesos) containers.
+-history_max_per_job_threshold (default 100)
+	Maximum number of terminated tasks to retain in a job history.
+-history_min_retention_threshold (default (1, hrs))
+	Minimum guaranteed time for task history retention before any pruning is attempted.
+-history_prune_threshold (default (2, days))
+	Time after which the scheduler will prune terminated task history.
+	The hostname to advertise in ZooKeeper instead of the locally-resolved hostname.
+-http_authentication_mechanism (default NONE)
+	HTTP Authentication mechanism to use.
+-http_port (default 0)
+	The port to start an HTTP server on.  Default value will choose a random port.
+-initial_flapping_task_delay (default (30, secs))
+	Initial amount of time to wait before attempting to schedule a flapping task.
+-initial_schedule_penalty (default (1, secs))
+	Initial amount of time to wait before attempting to schedule a task that has failed to schedule.
+-initial_task_kill_retry_interval (default (5, secs))
+	When killing a task, retry after this delay if mesos has not responded, backing off up to transient_task_state_timeout
+-job_update_history_per_job_threshold (default 10)
+	Maximum number of completed job updates to retain in a job update history.
+-job_update_history_pruning_interval (default (15, mins))
+	Job update history pruning interval.
+-job_update_history_pruning_threshold (default (30, days))
+	Time after which the scheduler will prune completed job update history.
+-kerberos_debug (default false)
+	Produce additional Kerberos debugging output.
+	Path to the server keytab.
+	Kerberos server principal to use, usually of the form HTTP/
+-max_flapping_task_delay (default (5, mins))
+	Maximum delay between attempts to schedule a flapping task.
+-max_leading_duration (default (1, days))
+	After leading for this duration, the scheduler should commit suicide.
+-max_registration_delay (default (1, mins))
+	Max allowable delay to allow the driver to register before aborting
+-max_reschedule_task_delay_on_startup (default (30, secs))
+	Upper bound of random delay for pending task rescheduling on scheduler startup.
+-max_saved_backups (default 48)
+	Maximum number of backups to retain before deleting the oldest backups.
+-max_schedule_attempts_per_sec (default 40.0)
+	Maximum number of scheduling attempts to make per second.
+-max_schedule_penalty (default (1, mins))
+	Maximum delay between attempts to schedule a PENDING task.
+-max_status_update_batch_size (default 1000) [must be > 0]
+	The maximum number of status updates that can be processed in a batch.
+-max_task_event_batch_size (default 300) [must be > 0]
+	The maximum number of task state change events that can be processed in a batch.
+-max_tasks_per_job (default 4000) [must be > 0]
+	Maximum number of allowed tasks in a single job.
+-max_tasks_per_schedule_attempt (default 5) [must be > 0]
+	The maximum number of tasks to pick in a single scheduling attempt.
+-max_update_instance_failures (default 20000) [must be > 0]
+	Upper limit on the number of failures allowed during a job update. This helps cap potentially unbounded entries into storage.
+-min_offer_hold_time (default (5, mins))
+	Minimum amount of time to hold a resource offer before declining.
+-native_log_election_retries (default 20)
+	The maximum number of attempts to obtain a new log writer.
+-native_log_election_timeout (default (15, secs))
+	The timeout for a single attempt to obtain a new log writer.
+	Path to a file to store the native log data in.  If the parent directory does not exist it will be created.
+-native_log_quorum_size (default 1)
+	The size of the quorum required for all log mutations.
+-native_log_read_timeout (default (5, secs))
+	The timeout for doing log reads.
+-native_log_write_timeout (default (3, secs))
+	The timeout for doing log appends and truncations.
+	A zookeeper node for use by the native log to track the master coordinator.
+-offer_filter_duration (default (5, secs))
+	Duration after which we expect Mesos to re-offer unused resources. A short duration improves scheduling performance in smaller clusters, but might lead to resource starvation for other frameworks if you run many frameworks in your cluster.
+-offer_hold_jitter_window (default (1, mins))
+	Maximum amount of random jitter to add to the offer hold time window.
+-offer_reservation_duration (default (3, mins))
+	Time to reserve an agent's offers while trying to satisfy a task preempting another.
+-populate_discovery_info (default false)
+	If true, Aurora populates DiscoveryInfo field of Mesos TaskInfo.
+-preemption_delay (default (3, mins))
+	Time interval after which a pending task becomes eligible to preempt other tasks
+-preemption_slot_finder_modules (default [class org.apache.aurora.scheduler.preemptor.PendingTaskProcessorModule, class org.apache.aurora.scheduler.preemptor.PreemptionVictimFilterModule])
+	Guice modules for replacing preemption logic.
+-preemption_slot_hold_time (default (5, mins))
+	Time to hold a preemption slot found before it is discarded.
+-preemption_slot_search_interval (default (1, mins))
+	Time interval between pending task preemption slot searches.
+-receive_revocable_resources (default false)
+	Allows receiving revocable resource offers from Mesos.
+-reconciliation_explicit_batch_interval (default (5, secs))
+	Interval between explicit batch reconciliation requests.
+-reconciliation_explicit_batch_size (default 1000) [must be > 0]
+	Number of tasks in a single batch request sent to Mesos for explicit reconciliation.
+-reconciliation_explicit_interval (default (60, mins))
+	Interval on which scheduler will ask Mesos for status updates of all non-terminal tasks known to scheduler.
+-reconciliation_implicit_interval (default (60, mins))
+	Interval on which scheduler will ask Mesos for status updates of all non-terminal tasks known to Mesos.
+-reconciliation_initial_delay (default (1, mins))
+	Initial amount of time to delay task reconciliation after scheduler start up.
+-reconciliation_schedule_spread (default (30, mins))
+	Difference between explicit and implicit reconciliation intervals intended to create a non-overlapping task reconciliation schedule.
+-require_docker_use_executor (default true)
+	If false, Docker tasks may run without an executor (EXPERIMENTAL)
+-scheduling_max_batch_size (default 3) [must be > 0]
+	The maximum number of scheduling attempts that can be processed in a batch.
+-serverset_endpoint_name (default http)
+	Name of the scheduler endpoint published in ZooKeeper.
+	Path to shiro.ini for authentication and authorization configuration.
+-shiro_realm_modules (default [class])
+	Guice modules for configuring Shiro Realms.
+-sla_non_prod_metrics (default [])
+	Metric categories collected for non production tasks.
+-sla_prod_metrics (default [JOB_UPTIMES, PLATFORM_UPTIME, MEDIANS])
+	Metric categories collected for production tasks.
+-sla_stat_refresh_interval (default (1, mins))
+	The SLA stat refresh interval.
+-slow_query_log_threshold (default (25, ms))
+	Log all queries that take at least this long to execute.
+-snapshot_hydrate_stores (default [locks, hosts, quota, job_updates])
+	Which H2-backed stores to fully hydrate on the Snapshot.
+-stat_retention_period (default (1, hrs))
+	Time for a stat to be retained in memory before expiring.
+-stat_sampling_interval (default (1, secs))
+	Statistic value sampling interval.
+-task_assigner_modules (default [class org.apache.aurora.scheduler.state.FirstFitTaskAssignerModule])
+	Guice modules for replacing task assignment logic.
+-thermos_executor_cpu (default 0.25)
+	The number of CPU cores to allocate for each instance of the executor.
+	Extra arguments to be passed to the thermos executor
+-thermos_executor_ram (default (128, MB))
+	The amount of RAM to allocate for each instance of the executor.
+-thermos_executor_resources (default [])
+	A comma separated list of additional resources to copy into the sandbox. Note: if thermos_executor_path is not the thermos_executor.pex file itself, this must include it.
+-thermos_home_in_sandbox (default false)
+	If true, changes HOME to the sandbox before running the executor. This primarily has the effect of causing the executor and runner to extract themselves into the sandbox.
+-transient_task_state_timeout (default (5, mins))
+	The amount of time after which to treat a task stuck in a transient state as LOST.
+-use_beta_db_task_store (default false)
+	Whether to use the experimental database-backed task store.
+-viz_job_url_prefix (default )
+	URL prefix for job container stats.
+	chroot path to use for the ZooKeeper connections
+	user:password to use when authenticating with ZooKeeper.
+-zk_in_proc (default false)
+	Launches an embedded zookeeper server for local testing causing -zk_endpoints to be ignored if specified.
+-zk_session_timeout (default (4, secs))
+	The ZooKeeper session timeout.
+-zk_use_curator (default true)
+	DEPRECATED: Uses Apache Curator as the zookeeper client; otherwise a copy of Twitter commons/zookeeper (the legacy library) is used.
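
The listing above follows a regular shape: a flag line with an optional `(default ...)` and an optional `[constraint]`, followed by an indented description. A minimal sketch of turning that output into structured records follows. The `parse_help` function and the embedded sample are assumptions for illustration and are not part of Aurora.

```python
import re

# -name, then optional "(default VALUE)", then optional "[constraint]".
FLAG_RE = re.compile(
    r"-(?P<name>\w+)"
    r"(?: \(default (?P<default>.*?)\))?"
    r"(?: \[(?P<constraint>.*)\])?"
)


def parse_help(text):
    """Parse `-help`-style output into a list of flag records."""
    flags = []
    section = None
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line:
            continue
        if line.endswith("flags:"):
            # "Required flags:" -> "required", "Optional flags:" -> "optional"
            section = line.split()[0].lower()
            continue
        match = FLAG_RE.fullmatch(line)
        if match:
            flags.append({
                "name": match.group("name"),
                "default": match.group("default"),
                "constraint": match.group("constraint"),
                "section": section,
                "description": "",
            })
        elif flags and raw[:1].isspace():
            # Indented lines belong to the most recent flag.
            joined = flags[-1]["description"] + " " + line.strip()
            flags[-1]["description"] = joined.strip()
    return flags


# A few lines copied from the listing above.
SAMPLE = """\
Required flags:
-cluster_name [not null]
\tName to identify the cluster being served.
Optional flags:
-backup_interval (default (1, hrs))
\tMinimum interval on which to write a storage backup.
-max_tasks_per_job (default 4000) [must be > 0]
\tMaximum number of allowed tasks in a single job.
"""

for flag in parse_help(SAMPLE):
    print(flag["section"], flag["name"], flag["default"], flag["constraint"])
```

A record list like this can be diffed between scheduler versions to spot flags whose defaults changed.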

Added: aurora/site/source/documentation/0.18.0/reference/
--- aurora/site/source/documentation/0.18.0/reference/ (added)
+++ aurora/site/source/documentation/0.18.0/reference/ Wed Jun 21 06:36:21 2017
@@ -0,0 +1,19 @@
+# HTTP endpoints
+There are a number of HTTP endpoints that the Aurora scheduler exposes. These allow various
+operational tasks to be performed on the scheduler. Below is an (incomplete) list of such endpoints
+and a brief explanation of what they do.
+## Leader health
+The `/leaderhealth` endpoint enables health checks on the scheduler instances in order
+to forward requests to the leading scheduler. This is typically used by a load balancer such as
+HAProxy or AWS ELB.
+When an HTTP GET request is issued on this endpoint, it responds as follows:
+- If the instance that received the GET request is the leading scheduler, an HTTP status code of
+  `200 OK` is returned.
+- If the instance that received the GET request is not the leading scheduler but a leader does
+  exist, an HTTP status code of `503 SERVICE_UNAVAILABLE` is returned.
+- If no leader currently exists or the leader is unknown, an HTTP status code of `502 BAD_GATEWAY`
+  is returned.
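
The three responses listed above can be checked from a small client. This is a minimal sketch using only the Python standard library. The function names, the state labels, and the scheduler URL in the usage note are assumptions for illustration.

```python
import urllib.error
import urllib.request

LEADER = "leader"
NOT_LEADER = "not_leader"
NO_LEADER = "no_leader"


def classify(status_code):
    """Map a /leaderhealth HTTP status code to a scheduler state."""
    if status_code == 200:
        return LEADER
    if status_code == 503:
        return NOT_LEADER
    if status_code == 502:
        return NO_LEADER
    raise ValueError("unexpected /leaderhealth status: {}".format(status_code))


def leader_health(base_url, timeout=2.0):
    """Issue GET <base_url>/leaderhealth and classify the response.

    urlopen raises HTTPError for 5xx responses, so the status code is
    read from the exception in that case. Connection failures
    (URLError) are not caught and propagate to the caller.
    """
    url = base_url.rstrip("/") + "/leaderhealth"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as error:
        status = error.code
    return classify(status)
```

For example, `leader_health("http://scheduler1.example.com:8081")` (a hypothetical address) would return `"leader"` only for the instance that currently leads.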
