From: serb@apache.org
To: commits@aurora.apache.org
Date: Mon, 28 Mar 2016 20:55:47 -0000
Subject: [6/7] aurora git commit: Reorganize Documentation

http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/configuration-tutorial.md
----------------------------------------------------------------------
diff --git a/docs/configuration-tutorial.md b/docs/configuration-tutorial.md
deleted file mode 100644
index 97664f3..0000000
--- a/docs/configuration-tutorial.md
+++ /dev/null

Aurora Configuration Tutorial
=============================

How to write Aurora configuration files, including feature descriptions
and best practices. When writing a configuration file, make use of
`aurora job inspect`. It takes the same job key and configuration file
arguments as `aurora job create` or `aurora update start`. It first ensures the
configuration parses, then outputs it in human-readable form.

You should read this after going through the general [Aurora Tutorial](tutorial.md).
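For example, to sanity-check a configuration before creating the job (the job key and file name below are placeholders):

    aurora job inspect cluster1/$USER/test/hello_world hello_world.aurora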
- [Aurora Configuration Tutorial](#user-content-aurora-configuration-tutorial)
    - [The Basics](#user-content-the-basics)
    - [Use Bottom-To-Top Object Ordering](#user-content-use-bottom-to-top-object-ordering)
    - [An Example Configuration File](#user-content-an-example-configuration-file)
    - [Defining Process Objects](#user-content-defining-process-objects)
    - [Getting Your Code Into The Sandbox](#user-content-getting-your-code-into-the-sandbox)
    - [Defining Task Objects](#user-content-defining-task-objects)
        - [SequentialTask: Running Processes in Parallel or Sequentially](#user-content-sequentialtask-running-processes-in-parallel-or-sequentially)
        - [SimpleTask](#user-content-simpletask)
        - [Combining tasks](#user-content-combining-tasks)
    - [Defining Job Objects](#user-content-defining-job-objects)
    - [The jobs List](#user-content-the-jobs-list)
    - [Templating](#user-content-templating)
        - [Templating 1: Binding in Pystachio](#user-content-templating-1-binding-in-pystachio)
            - [Structurals in Pystachio / Aurora](#user-content-structurals-in-pystachio--aurora)
            - [Mustaches Within Structurals](#user-content-mustaches-within-structurals)
        - [Templating 2: Structurals Are Factories](#user-content-templating-2-structurals-are-factories)
            - [A Second Way of Templating](#user-content-a-second-way-of-templating)
        - [Advanced Binding](#user-content-advanced-binding)
            - [Bind Syntax](#user-content-bind-syntax)
            - [Binding Complex Objects](#user-content-binding-complex-objects)
                - [Lists](#user-content-lists)
                - [Maps](#user-content-maps)
                - [Structurals](#user-content-structurals)
        - [Structural Binding](#user-content-structural-binding)
    - [Configuration File Writing Tips And Best Practices](#user-content-configuration-file-writing-tips-and-best-practices)
        - [Use As Few .aurora Files As Possible](#user-content-use-as-few-aurora-files-as-possible)
        - [Avoid Boilerplate](#user-content-avoid-boilerplate)
        - [Thermos Uses bash, But Thermos Is Not bash](#user-content-thermos-uses-bash-but-thermos-is-not-bash)
            - [Bad](#user-content-bad)
            - [Good](#user-content-good)
        - [Rarely Use Functions In Your Configurations](#user-content-rarely-use-functions-in-your-configurations)
            - [Bad](#user-content-bad-1)
            - [Good](#user-content-good-1)

The Basics
----------

To run a job on Aurora, you must specify a configuration file that tells
Aurora what it needs to know to schedule the job, what Mesos needs to
run the tasks the job is made up of, and what Thermos needs to run the
processes that make up the tasks. This file must have a `.aurora` suffix.

A configuration file defines a collection of objects, along with parameter
values for their attributes. An Aurora configuration file contains the
following three types of objects:

- Job
- Task
- Process

A configuration also specifies a list of `Job` objects assigned
to the variable `jobs`.

- jobs (list of defined Jobs to run)

The `.aurora` file format is just Python. However, `Job`, `Task`,
`Process`, and other classes are defined by a type-checked dictionary
templating library called *Pystachio*, a powerful tool for
configuration specification and reuse. Pystachio objects are tailored
via `{{}}`-surrounded templates.

When writing your `.aurora` file, you may use any Pystachio datatypes, as
well as any objects shown in the [*Aurora+Thermos Configuration
Reference*](configuration-reference.md), without `import` statements - the
Aurora config loader injects them automatically. Other than that, an `.aurora`
file works like any other Python script.

[*Aurora+Thermos Configuration Reference*](configuration-reference.md)
has a full reference of all Aurora/Thermos defined Pystachio objects.

### Use Bottom-To-Top Object Ordering

A well-structured configuration starts with structural templates (if
any). Structural templates encapsulate in their attributes all the
differences between Jobs in the configuration that are not directly
manipulated at the `Job` level, but typically at the `Process` or `Task`
level. For example, if certain processes are invoked with slightly
different settings or input.

After structural templates, define, in order, `Process`es, `Task`s, and
`Job`s.

Structural template names should be *UpperCamelCased* and their
instantiations are typically *UPPER\_SNAKE\_CASED*. `Process`, `Task`,
and `Job` names are typically *lower\_snake\_cased*. Indentation is typically 2
spaces.

An Example Configuration File
-----------------------------

The following is a typical configuration file. Don't worry if there are
parts you don't understand yet, but you may want to refer back to this
as you read about its individual parts. Note that names surrounded by
curly braces {{}} are template variables, which the system replaces with
bound values for the variables.

    # --- templates here ---
    class Profile(Struct):
      package_version = Default(String, 'live')
      java_binary = Default(String, '/usr/lib/jvm/java-1.7.0-openjdk/bin/java')
      extra_jvm_options = Default(String, '')
      parent_environment = Default(String, 'prod')
      parent_serverset = Default(String,
                                 '/foocorp/service/bird/{{parent_environment}}/bird')

    # --- processes here ---
    main = Process(
      name = 'application',
      cmdline = '{{profile.java_binary}} -server -Xmx1792m '
                '{{profile.extra_jvm_options}} '
                '-jar application.jar '
                '-upstreamService {{profile.parent_serverset}}'
    )

    # --- tasks ---
    base_task = SequentialTask(
      name = 'application',
      processes = [
        Process(
          name = 'fetch',
          cmdline = 'curl -O https://packages.foocorp.com/{{profile.package_version}}/application.jar'),
      ]
    )

    # not always necessary but often useful to have separate task
    # resource classes
    staging_task = base_task(resources =
                     Resources(cpu = 1.0,
                               ram = 2048*MB,
                               disk = 1*GB))
    production_task = base_task(resources =
                        Resources(cpu = 4.0,
                                  ram = 2560*MB,
                                  disk = 10*GB))

    # --- job template ---
    job_template = Job(
      name = 'application',
      role = 'myteam',
      contact = 'myteam-team@foocorp.com',
      instances = 20,
      service = True,
      task = production_task
    )

    # -- profile instantiations (if any) ---
    PRODUCTION = Profile()
    STAGING = Profile(
      extra_jvm_options = '-Xloggc:gc.log',
      parent_environment = 'staging'
    )

    # -- job instantiations --
    jobs = [
      job_template(cluster = 'cluster1', environment = 'prod')
        .bind(profile = PRODUCTION),

      job_template(cluster = 'cluster2', environment = 'prod')
        .bind(profile = PRODUCTION),

      job_template(cluster = 'cluster1',
                   environment = 'staging',
                   service = False,
                   task = staging_task,
                   instances = 2)
        .bind(profile = STAGING),
    ]

## Defining Process Objects

Processes are handled by the Thermos system. A process is a single
executable step run as a part of an Aurora task, which consists of a
bash-executable statement.

The key (and required) `Process` attributes are:

- `name`: Any string which is a valid Unix filename (no slashes,
  NULLs, or leading periods). The `name` value must be unique relative
  to other Processes in a `Task`.
- `cmdline`: A command line run in a bash subshell, so you can use
  bash scripts. Nothing is supplied for command-line arguments,
  so `$*` is unspecified.

Many tiny processes make managing configurations more difficult. For
example, the following is a bad way to define processes.

    copy = Process(
      name = 'copy',
      cmdline = 'curl -O https://packages.foocorp.com/app.zip'
    )
    unpack = Process(
      name = 'unpack',
      cmdline = 'unzip app.zip'
    )
    remove = Process(
      name = 'remove',
      cmdline = 'rm -f app.zip'
    )
    run = Process(
      name = 'app',
      cmdline = 'java -jar app.jar'
    )
    run_task = Task(
      processes = [copy, unpack, remove, run],
      constraints = order(copy, unpack, remove, run)
    )

Since `cmdline` runs in a bash subshell, you can chain commands
with `&&` or `||`.

When defining a `Task` that is just a list of Processes run in a
particular order, use `SequentialTask`, as described in the [*Defining*
`Task` *Objects*](#Task) section. The following simplifies and combines the
above multiple `Process` definitions into just two.

    stage = Process(
      name = 'stage',
      cmdline = 'curl -O https://packages.foocorp.com/app.zip && '
                'unzip app.zip && rm -f app.zip')

    run = Process(name = 'app', cmdline = 'java -jar app.jar')

    run_task = SequentialTask(processes = [stage, run])

`Process` also has optional attributes to customize its behaviour. Details can be found in the [*Aurora+Thermos Configuration Reference*](configuration-reference.md#process-objects).


## Getting Your Code Into The Sandbox

When using Aurora, you need to get your executable code into its "sandbox", specifically
the Task sandbox where the code executes for the Processes that make up that Task.

Each Task has a sandbox created when the Task starts and garbage
collected when it finishes. All of a Task's processes run in its
sandbox, so processes can share state by using a shared current
working directory.

Typically, you save this code somewhere. You then need to define a Process
in your `.aurora` configuration file that fetches the code from that somewhere
to where the slave can see it. For a public cloud, that can be anywhere public on
the Internet, such as S3. For a private cloud internal storage, you need to put it
on an accessible HDFS cluster or similar storage.

The template for this Process is:

    <fetch process> = Process(
      name = '<fetch process name>',
      cmdline = '<command to copy the code into the sandbox>'
    )

Note: Be sure the extracted code archive has an executable.
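As a concrete sketch of such a fetch Process (the package host and archive name are hypothetical):

    fetch_app = Process(
      name = 'fetch_app',
      # Download the application archive and unpack it into the sandbox.
      cmdline = 'curl -O https://packages.example.com/app.tar.gz && tar xzf app.tar.gz'
    )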
## Defining Task Objects

Tasks are handled by Mesos. A task is a collection of processes that
runs in a shared sandbox. It's the fundamental unit Aurora uses to
schedule the datacenter; essentially what Aurora does is find places
in the cluster to run tasks.

The key (and required) parts of a Task are:

- `name`: A string giving the Task's name. By default, if a Task is
  not given a name, it inherits the first name in its Process list.

- `processes`: An unordered list of Process objects bound to the Task.
  The value of the optional `constraints` attribute affects the
  contents as a whole. Currently, the only constraint, `order`, determines if
  the processes run in parallel or sequentially.

- `resources`: A `Resource` object defining the Task's resource
  footprint. A `Resource` object has three attributes:
  - `cpu`: A Float, the fractional number of cores the Task requires.
  - `ram`: An Integer, RAM bytes the Task requires.
  - `disk`: An Integer, disk bytes the Task requires.

A basic Task definition looks like:

    Task(
      name="hello_world",
      processes=[Process(name = "hello_world", cmdline = "echo hello world")],
      resources=Resources(cpu = 1.0,
                          ram = 1*GB,
                          disk = 1*GB))

A Task has optional attributes to customize its behaviour. Details can be found in the [*Aurora+Thermos Configuration Reference*](configuration-reference.md#task-object).


### SequentialTask: Running Processes in Parallel or Sequentially

By default, a Task with several Processes runs them in parallel. There
are two ways to run Processes sequentially:

- Include an `order` constraint in the Task definition's `constraints`
  attribute whose arguments specify the processes' run order:

        Task( ... processes=[process1, process2, process3],
              constraints = order(process1, process2, process3), ...)

- Use `SequentialTask` instead of `Task`; it automatically runs
  processes in the order specified in the `processes` attribute. No
  `constraint` parameter is needed:

        SequentialTask( ... processes=[process1, process2, process3] ...)

### SimpleTask

For quickly creating simple tasks, use the `SimpleTask` helper. It
creates a basic task from a provided name and command line using a
default set of resources. For example, in a `.aurora` configuration
file:

    SimpleTask(name="hello_world", command="echo hello world")

is equivalent to

    Task(name="hello_world",
         processes=[Process(name = "hello_world", cmdline = "echo hello world")],
         resources=Resources(cpu = 1.0,
                             ram = 1*GB,
                             disk = 1*GB))

The simplest idiomatic Job configuration thus becomes:

    import os
    hello_world_job = Job(
      task=SimpleTask(name="hello_world", command="echo hello world"),
      role=os.getenv('USER'),
      cluster="cluster1")

When written to `hello_world.aurora`, you invoke it with a simple
`aurora job create cluster1/$USER/test/hello_world hello_world.aurora`.

### Combining tasks

`Tasks.concat` (synonym: `concat_tasks`) and
`Tasks.combine` (synonym: `combine_tasks`) merge multiple Task definitions
into a single Task. It may be easier to define complex Jobs
as smaller constituent Tasks. But since a Job only includes a single
Task, the subtasks must be combined before using them in a Job.
Smaller Tasks can also be reused between Jobs, instead of having to
repeat their definition for multiple Jobs.

With both methods, the merged Task takes the first Task's name. The
difference between the two is the resulting Task's process ordering.

- `Tasks.combine` runs its subtasks' processes in no particular order.
  The new Task's resource consumption is the sum of all its subtasks'
  consumption.

- `Tasks.concat` runs its subtasks in the order supplied, with each
  subtask's processes run serially between tasks. It is analogous to
  the `order` constraint helper, except at the Task level instead of
  the Process level. The new Task's resource consumption is the
  maximum value specified by any subtask for each Resource attribute
  (cpu, ram and disk).

For example, given the following:

    setup_task = Task(
      ...
      processes=[download_interpreter, update_zookeeper],
      # It is important to note that {{Tasks.concat}} has
      # no effect on the ordering of the processes within a task;
      # hence the necessity of the {{order}} statement below
      # (otherwise, the order in which {{download_interpreter}}
      # and {{update_zookeeper}} run will be non-deterministic)
      constraints=order(download_interpreter, update_zookeeper),
      ...
    )

    run_task = SequentialTask(
      ...
      processes=[download_application, start_application],
      ...
    )

    combined_task = Tasks.concat(setup_task, run_task)

The `Tasks.concat` command merges the two Tasks into a single Task and
ensures all processes in `setup_task` run before the processes
in `run_task`. Conceptually, the task is reduced to:

    task = Task(
      ...
      processes=[download_interpreter, update_zookeeper,
                 download_application, start_application],
      constraints=order(download_interpreter, update_zookeeper,
                        download_application, start_application),
      ...
    )

In the case of `Tasks.combine`, the two schedules run in parallel:

    task = Task(
      ...
      processes=[download_interpreter, update_zookeeper,
                 download_application, start_application],
      constraints=order(download_interpreter, update_zookeeper) +
                  order(download_application, start_application),
      ...
    )

In the latter case, each of the two sequences may operate in parallel.
Of course, this may not be the intended behavior (for example, if
the `start_application` Process implicitly relies
upon `download_interpreter`). Make sure you understand the difference
between using one or the other.

## Defining Job Objects

A job is a group of identical tasks that Aurora can run in a Mesos cluster.

A `Job` object is defined by the values of several attributes, some
required and some optional. The required attributes are:

- `task`: Task object to bind to this job. Note that a Job can
  only take a single Task.

- `role`: Job's role account; in other words, the user account to run
  the job as on a Mesos cluster machine. A common value is
  `os.getenv('USER')`, using a Python command to get the user who
  submits the job request. The other common value is the service
  account that runs the job, e.g. `www-data`.

- `environment`: Job's environment, typical values
  are `devel`, `test`, or `prod`.

- `cluster`: Aurora cluster to schedule the job in, defined in
  `/etc/aurora/clusters.json` or `~/.clusters.json`. You can specify
  jobs where the only difference is the `cluster`, then at run time
  only run the Job whose job key includes your desired cluster's name.

You usually see a `name` parameter. By default, `name` inherits its
value from the Job's associated Task object, but you can override this
default. For these four parameters, a Job definition might look like:

    foo_job = Job( name = 'foo', cluster = 'cluster1',
                   role = os.getenv('USER'), environment = 'prod',
                   task = foo_task)

In addition to the required attributes, there are several optional
attributes. Details can be found in the [Aurora+Thermos Configuration Reference](configuration-reference.md#job-objects).


## The jobs List

At the end of your `.aurora` file, you need to specify a list of the
file's defined Jobs. For example, the following exports the jobs `job1`,
`job2`, and `job3`.

    jobs = [job1, job2, job3]

This allows the Aurora client to invoke commands on those jobs, such as
starting, updating, or killing them.

Templating
----------

The `.aurora` file format is just Python. However, `Job`, `Task`,
`Process`, and other classes are defined by a templating library called
*Pystachio*, a powerful tool for configuration specification and reuse.

[Aurora+Thermos Configuration Reference](configuration-reference.md)
has a full reference of all Aurora/Thermos defined Pystachio objects.

When writing your `.aurora` file, you may use any Pystachio datatypes, as
well as any objects shown in the *Aurora+Thermos Configuration
Reference* without `import` statements - the Aurora config loader
injects them automatically. Other than that, the `.aurora` format
works like any other Python script.

### Templating 1: Binding in Pystachio

Pystachio uses the visually distinctive {{}} to indicate template
variables. These are often called "mustache variables" after the
similarly appearing variables in the Mustache templating system and
because the curly braces resemble mustaches.

If you are familiar with the Mustache system, note that templates in Pystachio
have significant differences. They have no nesting, joining, or
inheritance semantics. On the other hand, templates are evaluated
iteratively, so this affords some level of indirection.

Let's start with the simplest template; text with one
variable, in this case `name`:

    Hello {{name}}

If we evaluate this as is, we'd get back:

    Hello

If a template variable doesn't have a value, when evaluated it's
replaced with nothing. If we add a binding to give it a value:

    { "name" : "Tom" }

We'd get back:

    Hello Tom

Every Pystachio object has an associated `.bind` method that can bind
values to {{}} variables. Bindings are not immediately evaluated.
Instead, they are evaluated only when the interpolated value of the
object is necessary, e.g. for performing equality or serializing a
message over the wire.

Objects with and without mustache templated variables behave
differently:

    >>> Float(1.5)
    Float(1.5)

    >>> Float('{{x}}.5')
    Float({{x}}.5)

    >>> Float('{{x}}.5').bind(x = 1)
    Float(1.5)

    >>> Float('{{x}}.5').bind(x = 1) == Float(1.5)
    True

    >>> contextual_object = String('{{metavar{{number}}}}').bind(
    ...     metavar1 = "first", metavar2 = "second")

    >>> contextual_object
    String({{metavar{{number}}}})

    >>> contextual_object.bind(number = 1)
    String(first)

    >>> contextual_object.bind(number = 2)
    String(second)

You usually bind simple key to value pairs, but you can also bind three
other objects: lists, dictionaries, and structurals. These will be
described in detail later.

### Structurals in Pystachio / Aurora

Most Aurora/Thermos users don't ever (knowingly) interact with `String`,
`Float`, or `Integer` Pystachio objects directly. Instead they interact
with derived structural (`Struct`) objects that are collections of
fundamental and structural objects. The structural object components are
called *attributes*. Aurora's most used structural objects are `Job`,
`Task`, and `Process`:

    class Process(Struct):
      cmdline = Required(String)
      name = Required(String)
      max_failures = Default(Integer, 1)
      daemon = Default(Boolean, False)
      ephemeral = Default(Boolean, False)
      min_duration = Default(Integer, 5)
      final = Default(Boolean, False)

Construct default objects by following the object's type with (). If you
want an attribute to have a value different from its default, include
the attribute name and value inside the parentheses.

    >>> Process()
    Process(daemon=False, max_failures=1, ephemeral=False,
      min_duration=5, final=False)

Attribute values can be template variables, which then receive specific
values when creating the object.

    >>> Process(cmdline = 'echo {{message}}')
    Process(daemon=False, max_failures=1, ephemeral=False, min_duration=5,
      cmdline=echo {{message}}, final=False)

    >>> Process(cmdline = 'echo {{message}}').bind(message = 'hello world')
    Process(daemon=False, max_failures=1, ephemeral=False, min_duration=5,
      cmdline=echo hello world, final=False)

A powerful binding property is that all of an object's children inherit its
bindings:

    >>> List(Process)([
    ...   Process(name = '{{prefix}}_one'),
    ...   Process(name = '{{prefix}}_two')
    ... ]).bind(prefix = 'hello')
    ProcessList(
      Process(daemon=False, name=hello_one, max_failures=1, ephemeral=False, min_duration=5, final=False),
      Process(daemon=False, name=hello_two, max_failures=1, ephemeral=False, min_duration=5, final=False)
      )

Remember that an Aurora Job contains Tasks which contain Processes. A
Job level binding is inherited by its Tasks and all their Processes.
Similarly a Task level binding is available to that Task and its
Processes but is *not* visible at the Job level (inheritance is a
one-way street).
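For instance, a binding applied on a Task flows down to its Processes (an illustrative sketch; no output is shown since the full `Task` repr is long):

    >>> t = Task(name = 'greeter',
    ...          processes = [Process(name = 'greet', cmdline = 'echo {{greeting}}')])
    >>> t = t.bind(greeting = 'hello')   # the inner Process inherits this binding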
#### Mustaches Within Structurals

When you define a `Struct` schema, one powerful, but confusing, feature
is that all of that structure's attributes are Mustache variables within
the enclosing scope *once they have been populated*.

For example, when `Process` is defined above, all its attributes such as
{{`name`}}, {{`cmdline`}}, {{`max_failures`}} etc., are all immediately
defined as Mustache variables, implicitly bound into the `Process`, and
are inherited by all child objects once they are defined.

Thus, you can do the following:

    >>> Process(name = "installer", cmdline = "echo {{name}} is running")
    Process(daemon=False, name=installer, max_failures=1, ephemeral=False, min_duration=5,
      cmdline=echo installer is running, final=False)

WARNING: This binding only takes place in one direction. For example,
the following does NOT work and does not set the `Process` `name`
attribute's value.

    >>> Process().bind(name = "installer")
    Process(daemon=False, max_failures=1, ephemeral=False, min_duration=5, final=False)

The following is also not possible and results in an infinite loop that
attempts to resolve `Process.name`.

    >>> Process(name = '{{name}}').bind(name = 'installer')

Do not confuse Structural attributes with bound Mustache variables.
Attributes are implicitly converted to Mustache variables but not vice
versa.

### Templating 2: Structurals Are Factories

#### A Second Way of Templating

A second templating method is both as powerful as the aforementioned and
often confused with it. This method is due to automatic conversion of
Struct attributes to Mustache variables as described above.

Suppose you create a Process object:

    >>> p = Process(name = "process_one", cmdline = "echo hello world")

    >>> p
    Process(daemon=False, name=process_one, max_failures=1, ephemeral=False, min_duration=5,
      cmdline=echo hello world, final=False)

This `Process` object, "`p`", can be used wherever a `Process` object is
needed. It can also be reused by changing the value(s) of its
attribute(s). Here we change its `name` attribute from `process_one` to
`process_two`.

    >>> p(name = "process_two")
    Process(daemon=False, name=process_two, max_failures=1, ephemeral=False, min_duration=5,
      cmdline=echo hello world, final=False)

Template creation is a common use for this technique:

    >>> Daemon = Process(daemon = True)
    >>> logrotate = Daemon(name = 'logrotate', cmdline = './logrotate conf/logrotate.conf')
    >>> mysql = Daemon(name = 'mysql', cmdline = 'bin/mysqld --safe-mode')

### Advanced Binding

As described above, `.bind()` binds simple strings or numbers to
Mustache variables. In addition to Structural types formed by combining
atomic types, Pystachio has two container types, `List` and `Map`, which
can also be bound via `.bind()`.

#### Bind Syntax

The `bind()` function can take Python dictionaries or `kwargs`
interchangeably (when "`kwargs`" is in a function definition, `kwargs`
receives a Python dictionary containing all keyword arguments after the
formal parameter list).

    >>> String('{{foo}}').bind(foo = 'bar') == String('{{foo}}').bind({'foo': 'bar'})
    True

Bindings done "closer" to the object in question take precedence:

    >>> p = Process(name = '{{context}}_process')
    >>> t = Task().bind(context = 'global')
    >>> t(processes = [p, p.bind(context = 'local')])
    Task(processes=ProcessList(
      Process(daemon=False, name=global_process, max_failures=1, ephemeral=False, final=False,
        min_duration=5),
      Process(daemon=False, name=local_process, max_failures=1, ephemeral=False, final=False,
        min_duration=5)
    ))

#### Binding Complex Objects

##### Lists

    >>> fibonacci = List(Integer)([1, 1, 2, 3, 5, 8, 13])
    >>> String('{{fib[4]}}').bind(fib = fibonacci)
    String(5)

##### Maps

    >>> first_names = Map(String, String)({'Kent': 'Clark', 'Wayne': 'Bruce', 'Prince': 'Diana'})
    >>> String('{{first[Kent]}}').bind(first = first_names)
    String(Clark)

##### Structurals

    >>> String('{{p.cmdline}}').bind(p = Process(cmdline = "echo hello world"))
    String(echo hello world)

### Structural Binding

Use structural templates when binding more than two or three individual
values at the Job or Task level. For fewer than two or three, standard
key to string binding is sufficient.

Structural binding is a very powerful pattern and is most useful in
Aurora/Thermos for doing Structural configuration. For example, you can
define a job profile. The following profile uses `HDFS`, the Hadoop
Distributed File System, to designate a file's location. `HDFS` does
not come with Aurora, so you'll need to either install it separately
or change the way the dataset is designated.

    class Profile(Struct):
      version = Required(String)
      environment = Required(String)
      dataset = Default(String, 'hdfs://home/aurora/data/{{environment}}')

    PRODUCTION = Profile(version = 'live', environment = 'prod')
    DEVEL = Profile(version = 'latest',
                    environment = 'devel',
                    dataset = 'hdfs://home/aurora/data/test')
    TEST = Profile(version = 'latest', environment = 'test')

    JOB_TEMPLATE = Job(
      name = 'application',
      role = 'myteam',
      cluster = 'cluster1',
      environment = '{{profile.environment}}',
      task = SequentialTask(
        name = 'task',
        resources = Resources(cpu = 2, ram = 4*GB, disk = 8*GB),
        processes = [
          Process(name = 'main', cmdline = 'java -jar application.jar '
                                           '-hdfsPath {{profile.dataset}}')
        ]
      )
    )

    jobs = [
      JOB_TEMPLATE(instances = 100).bind(profile = PRODUCTION),
      JOB_TEMPLATE.bind(profile = DEVEL),
      JOB_TEMPLATE.bind(profile = TEST),
    ]

In this case, a custom structural "Profile" is created to self-document
the configuration to some degree. This also allows some schema
"type-checking", and for default self-substitution, e.g. in
`Profile.dataset` above.

So rather than a `.bind()` with a half-dozen substituted variables, you
can bind a single object that has sensible defaults stored in a single
place.

Configuration File Writing Tips And Best Practices
--------------------------------------------------

### Use As Few .aurora Files As Possible

When creating your `.aurora` configuration, try to keep all versions of
a particular job within the same `.aurora` file. For example, if you
have separate jobs for `cluster1`, `cluster1` staging, `cluster1`
testing, and `cluster2`, keep them as close together as possible.

Constructs shared across multiple jobs owned by your team (e.g.
team-level defaults or structural templates) can be split into separate
`.aurora` files and included via the `include` directive, as shown below.
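For example (the file name is illustrative), a shared template file can be pulled in at the top of a job configuration:

    include('team_templates.aurora')  # assumed to define shared Profiles, Processes, etc.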
### Avoid Boilerplate

If you see repetition or find yourself copy and pasting any parts of
your configuration, it's likely an opportunity for templating. Take the
example below:

`redundant.aurora` contains:

    download = Process(
      name = 'download',
      cmdline = 'wget http://www.python.org/ftp/python/2.7.3/Python-2.7.3.tar.bz2',
      max_failures = 5,
      min_duration = 1)

    unpack = Process(
      name = 'unpack',
      cmdline = 'rm -rf Python-2.7.3 && tar xjf Python-2.7.3.tar.bz2',
      max_failures = 5,
      min_duration = 1)

    build = Process(
      name = 'build',
      cmdline = 'pushd Python-2.7.3 && ./configure && make && popd',
      max_failures = 1)

    email = Process(
      name = 'email',
      cmdline = 'echo Success | mail feynman@tmc.com',
      max_failures = 5,
      min_duration = 1)

    build_python = Task(
      name = 'build_python',
      processes = [download, unpack, build, email],
      constraints = [Constraint(order = ['download', 'unpack', 'build', 'email'])])

As you'll notice, there's a lot of repetition in the `Process`
definitions. For example, almost every process sets a `max_failures`
limit to 5 and a `min_duration` to 1. This is an opportunity for factoring
into a common process template.

Furthermore, the Python version is repeated everywhere. This can be
bound via structural templating as described in the [Advanced Binding](#AdvancedBinding)
section.

`less_redundant.aurora` contains:

    class Python(Struct):
      version = Required(String)
      base = Default(String, 'Python-{{version}}')
      package = Default(String, '{{base}}.tar.bz2')

    ReliableProcess = Process(
      max_failures = 5,
      min_duration = 1)

    download = ReliableProcess(
      name = 'download',
      cmdline = 'wget http://www.python.org/ftp/python/{{python.version}}/{{python.package}}')

    unpack = ReliableProcess(
      name = 'unpack',
      cmdline = 'rm -rf {{python.base}} && tar xjf {{python.package}}')

    build = ReliableProcess(
      name = 'build',
      cmdline = 'pushd {{python.base}} && ./configure && make && popd',
      max_failures = 1)

    email = ReliableProcess(
      name = 'email',
      cmdline = 'echo Success | mail {{role}}@foocorp.com')

    build_python = SequentialTask(
      name = 'build_python',
      processes = [download, unpack, build, email]).bind(python = Python(version = "2.7.3"))

### Thermos Uses bash, But Thermos Is Not bash

#### Bad

Many tiny Processes make for harder-to-manage configurations.

    copy = Process(
      name = 'copy',
      cmdline = 'rcp user@my_machine:my_application .'
    )

    unpack = Process(
      name = 'unpack',
      cmdline = 'unzip app.zip'
    )

    remove = Process(
      name = 'remove',
      cmdline = 'rm -f app.zip'
    )

    run = Process(
      name = 'app',
      cmdline = 'java -jar app.jar'
    )

    run_task = Task(
      processes = [copy, unpack, remove, run],
      constraints = order(copy, unpack, remove, run)
    )

#### Good

Each `cmdline` runs in a bash subshell, so you have the full power of
bash. Chaining commands with `&&` or `||` is almost always the right
thing to do.

Also for Tasks that are simply a list of processes that run one after
another, consider using the `SequentialTask` helper which applies a
linear ordering constraint for you.

    stage = Process(
      name = 'stage',
      cmdline = 'rcp user@my_machine:my_application . && unzip app.zip && rm -f app.zip')

    run = Process(name = 'app', cmdline = 'java -jar app.jar')

    run_task = SequentialTask(processes = [stage, run])

### Rarely Use Functions In Your Configurations

90% of the time you define a function in a `.aurora` file, you're
probably Doing It Wrong(TM).

#### Bad

    def get_my_task(name, user, cpu, ram, disk):
      return Task(
        name = name,
        user = user,
        processes = [STAGE_PROCESS, RUN_PROCESS],
        constraints = order(STAGE_PROCESS, RUN_PROCESS),
        resources = Resources(cpu = cpu, ram = ram, disk = disk)
      )

    task_one = get_my_task('task_one', 'feynman', 1.0, 32*MB, 1*GB)
    task_two = get_my_task('task_two', 'feynman', 2.0, 64*MB, 1*GB)

#### Good

This one is more idiomatic. Forced keyword arguments prevent accidents,
e.g. constructing a task with "32*MB" when you mean 32 MB of RAM and not
disk. Less proliferation of task-construction techniques means
easier-to-read, quicker-to-understand, and more composable
configuration.

    TASK_TEMPLATE = SequentialTask(
      user = 'feynman',
      processes = [STAGE_PROCESS, RUN_PROCESS],
    )

    task_one = TASK_TEMPLATE(
      name = 'task_one',
      resources = Resources(cpu = 1.0, ram = 32*MB, disk = 1*GB)
    )

    task_two = TASK_TEMPLATE(
      name = 'task_two',
      resources = Resources(cpu = 2.0, ram = 64*MB, disk = 1*GB)
    )

http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/cron-jobs.md
----------------------------------------------------------------------
diff --git a/docs/cron-jobs.md b/docs/cron-jobs.md
deleted file mode 100644
index 0f98425..0000000
--- a/docs/cron-jobs.md
+++ /dev/null

# Cron Jobs

Aurora supports execution of scheduled jobs on a Mesos cluster using cron-style syntax.

- [Overview](#overview)
- [Collision Policies](#collision-policies)
    - [KILL_EXISTING](#kill_existing)
    - [CANCEL_NEW](#cancel_new)
- [Failure recovery](#failure-recovery)
- [Interacting with cron jobs via the Aurora CLI](#interacting-with-cron-jobs-via-the-aurora-cli)
    - [cron schedule](#cron-schedule)
    - [cron deschedule](#cron-deschedule)
    - [cron start](#cron-start)
    - [job killall, job restart, job kill](#job-killall-job-restart-job-kill)
- [Technical Note About Syntax](#technical-note-about-syntax)
- [Caveats](#caveats)
    - [Failovers](#failovers)
    - [Collision policy is best-effort](#collision-policy-is-best-effort)
    - [Timezone Configuration](#timezone-configuration)

## Overview

A job is identified as a cron job by the presence of a
`cron_schedule` attribute containing a cron-style schedule in the
[`Job`](configuration-reference.md#job-objects) object. Examples of cron schedules
include "every 5 minutes" (`*/5 * * * *`), "Fridays at 17:00" (`0 17 * * FRI`), and
"the 1st and 15th day of the month at 03:00" (`0 3 1,15 * *`).

Example (available in the [Vagrant environment](vagrant.md)):

    $ cat /vagrant/examples/job/cron_hello_world.aurora
    # cron_hello_world.aurora
    # A cron job that runs every 5 minutes.
    jobs = [
      Job(
        cluster = 'devcluster',
        role = 'www-data',
        environment = 'test',
        name = 'cron_hello_world',
        cron_schedule = '*/5 * * * *',
        task = SimpleTask(
          'cron_hello_world',
          'echo "Hello world from cron, the time is now $(date --rfc-822)"'),
      ),
    ]

## Collision Policies

The `cron_collision_policy` field specifies the scheduler's behavior when a new cron job is
triggered while an older run hasn't finished. The scheduler has two policies available,
[KILL_EXISTING](#kill_existing) and [CANCEL_NEW](#cancel_new).

### KILL_EXISTING

The default policy - on a collision the old instances are killed and instances with the current
configuration are started.

### CANCEL_NEW

On a collision the new run is cancelled.

Note that the use of this policy is likely a code smell - interrupted cron jobs should be able
to recover their progress on a subsequent invocation, otherwise they risk having their work queue
grow faster than they can process it.

## Failure recovery

Unlike services, which Aurora will always re-execute regardless of exit status, instances of
cron jobs retry according to the `max_task_failures` attribute of the
[Job](configuration-reference.md#job-objects) object. To get "run-until-success" semantics,
set `max_task_failures` to `-1`.
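A minimal sketch of run-until-success for a cron job (only the relevant fields are shown; the schedule is hypothetical and the rest of the Job is elided):

    nightly_report = Job(
      # ... cluster, role, environment, name, task ...
      cron_schedule = '0 2 * * *',   # hypothetical: every night at 02:00
      max_task_failures = -1,        # retry each scheduled run until it succeeds
    )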
## Interacting with cron jobs via the Aurora CLI

Most interaction with cron jobs takes place using the `cron` subcommand. See `aurora cron -h`
for up-to-date usage instructions.

### cron schedule
Schedules a new cron job on the Aurora cluster for later runs, or replaces an existing cron template
with a new one. Only future runs will be affected; any existing active tasks are left intact.

    $ aurora cron schedule devcluster/www-data/test/cron_hello_world /vagrant/examples/jobs/cron_hello_world.aurora

### cron deschedule
Deschedules a cron job, preventing future runs but allowing current runs to complete.

    $ aurora cron deschedule devcluster/www-data/test/cron_hello_world

### cron start
Starts a cron job immediately, outside of its normal cron schedule.

    $ aurora cron start devcluster/www-data/test/cron_hello_world

### job killall, job restart, job kill
Cron jobs create instances running on the cluster that you can interact with like normal Aurora
tasks with `job kill` and `job restart`.

## Technical Note About Syntax

`cron_schedule` uses a restricted subset of BSD crontab syntax. While the
execution engine currently uses Quartz, the schedule parsing is custom code that implements
a subset of FreeBSD [crontab(5)](http://www.freebsd.org/cgi/man.cgi?crontab(5)) syntax. See
[the source](https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/cron/CrontabEntry.java#L106-L124)
for details.

## Caveats

### Failovers
No failover recovery. Aurora does not record the latest minute for which it fired
triggers across failovers. Therefore it's possible to miss triggers
on failover. Note that this behavior may change in the future.

It's necessary to sync time between schedulers with something like `ntpd`.
Clock skew could cause double or missed triggers in the case of a failover.

### Collision policy is best-effort
Aurora aims to always have *at least one copy* of a given instance running at a time - it's
an AP system, meaning it chooses Availability and Partition Tolerance at the expense of
Consistency.

If your collision policy was `CANCEL_NEW` and a task has terminated but
Aurora has not noticed this, Aurora will go ahead and create your new
task.

If your collision policy was `KILL_EXISTING` and a task was marked `LOST`
but not yet GCed, Aurora will go ahead and create your new task without
attempting to kill the old one (outside the GC interval).

### Timezone Configuration
The cron timezone is configured independently of the JVM timezone with the `-cron_timezone` flag and
defaults to UTC.
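For example, to evaluate cron schedules in US Pacific time instead of UTC (scheduler flag; the zone name is illustrative):

    -cron_timezone=America/Los_Angeles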
http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/deploying-aurora-scheduler.md
----------------------------------------------------------------------
diff --git a/docs/deploying-aurora-scheduler.md b/docs/deploying-aurora-scheduler.md
deleted file mode 100644
index 03bfdba..0000000
--- a/docs/deploying-aurora-scheduler.md
+++ /dev/null

# Deploying the Aurora Scheduler

When setting up your cluster, you will install the scheduler on a small number (usually 3 or 5) of
machines. This guide helps you get the scheduler set up and troubleshoot some common hurdles.

- [Installing Aurora](#installing-aurora)
    - [Creating the Distribution .zip File (Optional)](#creating-the-distribution-zip-file-optional)
    - [Installing Aurora](#installing-aurora-1)
- [Configuring Aurora](#configuring-aurora)
    - [A Note on Configuration](#a-note-on-configuration)
    - [Replicated Log Configuration](#replicated-log-configuration)
    - [Initializing the Replicated Log](#initializing-the-replicated-log)
    - [Storage Performance Considerations](#storage-performance-considerations)
    - [Network considerations](#network-considerations)
    - [Considerations for running jobs in docker](#considerations-for-running-jobs-in-docker)
    - [Security Considerations](#security-considerations)
    - [Configuring Resource Oversubscription](#configuring-resource-oversubscription)
    - [Process Logs](#process-logs)
- [Running Aurora](#running-aurora)
    - [Maintaining an Aurora Installation](#maintaining-an-aurora-installation)
    - [Monitoring](#monitoring)
    - [Running stateful services](#running-stateful-services)
        - [Dedicated attribute](#dedicated-attribute)
            - [Syntax](#syntax)
            - [Example](#example)
- [Best practices](#best-practices)
    - [Diversity](#diversity)
- [Common problems](#common-problems)
    - [Replicated log not initialized](#replicated-log-not-initialized)
        - [Symptoms](#symptoms)
        - [Solution](#solution)
    - [Scheduler not registered](#scheduler-not-registered)
        - [Symptoms](#symptoms-1)
        - [Solution](#solution-1)
- [Changing Scheduler Quorum Size](#changing-scheduler-quorum-size)
    - [Preparation](#preparation)
    - [Adding New Schedulers](#adding-new-schedulers)

## Installing Aurora
The Aurora scheduler is a standalone Java server. As part of the build process it creates a bundle
of all its dependencies, with the notable exceptions of the JVM and libmesos. Each target server
should have a JVM (Java 8 or higher) and libmesos (0.25.0) installed.

### Creating the Distribution .zip File (Optional)
To create a distribution for installation you will need build tools installed. On Ubuntu this can be
done with `sudo apt-get install build-essential default-jdk`.

    git clone http://git-wip-us.apache.org/repos/asf/aurora.git
    cd aurora
    ./gradlew distZip

Copy the generated `dist/distributions/aurora-scheduler-*.zip` to each node that will run a scheduler.

### Installing Aurora
Extract the aurora-scheduler zip file. The example configurations assume it is extracted to
`/usr/local/aurora-scheduler`.

    sudo unzip dist/distributions/aurora-scheduler-*.zip -d /usr/local
    sudo ln -nfs "$(ls -dt /usr/local/aurora-scheduler-* | head -1)" /usr/local/aurora-scheduler

## Configuring Aurora

### A Note on Configuration
Like Mesos, Aurora uses command-line flags for runtime configuration. As such the Aurora
"configuration file" is typically a `scheduler.sh` shell script of the following form.

    #!/bin/bash
    AURORA_HOME=/usr/local/aurora-scheduler

    # Flags controlling the JVM.
    JAVA_OPTS=(
      -Xmx2g
      -Xms2g
      # GC tuning, etc.
    )

    # Flags controlling the scheduler.
    AURORA_FLAGS=(
      -http_port=8081
      # Log configuration, etc.
    )

    # Environment variables controlling libmesos
    export JAVA_HOME=...
    export GLOG_v=1
    export LIBPROCESS_PORT=8083

    JAVA_OPTS="${JAVA_OPTS[*]}" exec "$AURORA_HOME/bin/aurora-scheduler" "${AURORA_FLAGS[@]}"

That way Aurora's current flags are visible in `ps` and in the `/vars` admin endpoint.

Examples are available under `examples/scheduler/`. For a list of available Aurora flags and their
documentation, see [this document](scheduler-configuration.md).

### Replicated Log Configuration
All Aurora state is persisted to a replicated log. This includes all jobs Aurora is running
including where in the cluster they are being run and the configuration for running them, as
well as other information such as metadata needed to reconnect to the Mesos master, resource
quotas, and any other locks in place.

Aurora schedulers use ZooKeeper to discover log replicas and elect a leader. Only one scheduler is
leader at a given time - the other schedulers follow log writes and prepare to take over as leader
but do not communicate with the Mesos master. Either 3 or 5 schedulers are recommended in a
production deployment depending on failure tolerance, and they must have persistent storage.

In a cluster with `N` schedulers, the flag `-native_log_quorum_size` should be set to
`floor(N/2) + 1`. So in a cluster with 1 scheduler it should be set to `1`, in a cluster with 3 it
should be set to `2`, and in a cluster of 5 it should be set to `3`.

  Number of schedulers (N) | `-native_log_quorum_size` setting (`floor(N/2) + 1`)
  ------------------------ | -----------------------------------------------------
  1                        | 1
  3                        | 2
  5                        | 3
  7                        | 4

*Incorrectly setting this flag will cause data corruption to occur!*

See [this document](storage-config.md#scheduler-storage-configuration-flags) for more replicated
log and storage configuration options.

### Initializing the Replicated Log
Before you start Aurora you will also need to initialize the log on a majority of the schedulers.

    mesos-log initialize --path="/path/to/native/log"

The `--path` flag should match the `--native_log_file_path` flag to the scheduler.
Failing to do this will result in the following message when you try to start the scheduler.

    Replica in EMPTY status received a broadcasted recover request
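A minimal sketch of initializing the log on a majority of the scheduler hosts (host names and the log path are hypothetical):

    for host in scheduler1 scheduler2 scheduler3; do
      ssh "$host" mesos-log initialize --path="/var/lib/aurora/scheduler/db"
    done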
### Storage Performance Considerations

See [this document](scheduler-storage.md) for scheduler storage performance considerations.

### Network considerations
The Aurora scheduler listens on 2 ports - an HTTP port used for client RPCs and a web UI,
and a libprocess (HTTP+Protobuf) port used to communicate with the Mesos master and for the log
replication protocol. These can be left unconfigured (the scheduler publishes all selected ports
to ZooKeeper) or explicitly set in the startup script as follows:

    # ...
    AURORA_FLAGS=(
      # ...
      -http_port=8081
      # ...
    )
    # ...
    export LIBPROCESS_PORT=8083
    # ...

### Considerations for running jobs in docker containers
In order for Aurora to launch jobs using docker containers, a few extra configuration options
must be set. The [docker containerizer](http://mesos.apache.org/documentation/latest/docker-containerizer/)
must be enabled on the mesos slaves by launching them with the `--containerizers=docker,mesos` option.

By default, Aurora will configure Mesos to copy the file specified in `-thermos_executor_path`
into the container's sandbox. If using a wrapper script to launch the thermos executor,
specify the path to the wrapper in that argument. In addition, the path to the executor pex itself
must be included in the `-thermos_executor_resources` option. Doing so will ensure that both the
wrapper script and executor are correctly copied into the sandbox. Finally, ensure the wrapper
script does not access resources outside of the sandbox, as when the script is run from within a
docker container those resources will not exist.

In order to correctly execute processes inside a job, the docker container must have Python 2.7
installed.

A scheduler flag, `-global_container_mounts`, allows mounting paths from the host (i.e., the slave)
into all containers on that host. The format is a comma-separated list of host_path:container_path[:mode]
tuples. For example `-global_container_mounts=/opt/secret_keys_dir:/mnt/secret_keys_dir:ro` mounts
`/opt/secret_keys_dir` from the slaves into all launched containers. Valid modes are `ro` and `rw`.

If you would like to supply your own parameters to `docker run` when launching jobs in docker
containers, you may use the following flags:

    -allow_docker_parameters
    -default_docker_parameters

`-allow_docker_parameters` controls whether or not users may pass their own configuration parameters
through the job configuration files. If set to `false` (the default), the scheduler will reject
jobs with custom parameters. *NOTE*: this setting should be used with caution as it allows any job
owner to specify any parameters they wish, including those that may introduce security concerns
(`privileged=true`, for example).

`-default_docker_parameters` allows a cluster operator to specify a universal set of parameters that
should be used for every container that does not have parameters explicitly configured at the job
level. The argument accepts a multimap format:

    -default_docker_parameters="read-only=true,tmpfs=/tmp,tmpfs=/run"

### Process Logs

#### Log destination
By default, Thermos will write process stdout/stderr to log files in the sandbox. Process object configuration
allows specifying alternate log file destinations like streamed stdout/stderr or suppression of all log output.
Default behavior can be configured for the entire cluster with the following flag (through the `-thermos_executor_flags`
argument to the Aurora scheduler):

    --runner-logger-destination=both

The `both` configuration will send logs to files and stream to parent stdout/stderr outputs.

See [this document](configuration-reference.md#logger) for all destination options.

#### Log rotation
By default, Thermos will not rotate the stdout/stderr logs from child processes and they will grow
without bound. An individual user may change this behavior via configuration on the Process object,
but it may also be desirable to change the default configuration for the entire cluster.
In order to enable rotation by default, the following flags can be applied to Thermos (through the
`-thermos_executor_flags` argument to the Aurora scheduler):

    --runner-logger-mode=rotate
    --runner-rotate-log-size-mb=100
    --runner-rotate-log-backups=10

In the above example, each instance of the Thermos runner will rotate stderr/stdout logs once they
reach 100 MiB in size and keep a maximum of 10 backups. If a user has provided a custom setting for
their process, it will override these default settings.
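A Process can override the cluster default through its `logger` attribute; a sketch, assuming the `Logger` and `RotatePolicy` schema from the configuration reference:

    chatty = Process(
      name = 'chatty',
      cmdline = 'java -jar app.jar',
      # Rotate this process's logs at 50 MiB, keeping 5 backups.
      logger = Logger(
        destination = 'file',
        mode = 'rotate',
        rotate = RotatePolicy(log_size = 50*MB, backups = 5)))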
Aurora -supports an active health checking protocol on its admin HTTP interface - if a `GET /health` times -out or returns anything other than `200 OK` the scheduler process is unhealthy and should be -restarted. - -For example, monit can be configured with - - if failed port 8081 send "GET /health HTTP/1.0\r\n" expect "OK\n" with timeout 2 seconds for 10 cycles then restart - -assuming you set `-http_port=8081`. - -## Security Considerations - -See [security.md](security.md). - -## Configuring Resource Oversubscription - -**WARNING**: This feature is currently in alpha status. Do not use it in production clusters! -See [this document](configuration-reference.md#revocable-jobs) for more feature details. - -Set these scheduler flag to allow receiving revocable Mesos offers: - - -receive_revocable_resources=true - -Specify a tier configuration file path: - - -tier_config=path/to/tiers/config.json - -Default [tier configuration file](../src/main/resources/org/apache/aurora/scheduler/tiers.json). - -### Maintaining an Aurora Installation - -### Monitoring -Please see our dedicated [monitoring guide](monitoring.md) for in-depth discussion on monitoring. - -### Running stateful services -Aurora is best suited to run stateless applications, but it also accommodates for stateful services -like databases, or services that otherwise need to always run on the same machines. - -#### Dedicated attribute -The Mesos slave has the `--attributes` command line argument which can be used to mark a slave with -static attributes (not to be confused with `--resources`, which are dynamic and accounted). - -Aurora makes these attributes available for matching with scheduling -[constraints](configuration-reference.md#specifying-scheduling-constraints). Most of these -constraints are arbitrary and available for custom use. There is one exception, though: the -`dedicated` attribute. Aurora treats this specially, and only allows matching jobs to run on these -machines, and will only schedule matching jobs on these machines. - -See the [section](resources.md#resource-quota) about resource quotas to learn how quotas apply to -dedicated jobs. - -##### Syntax -The dedicated attribute has semantic meaning. The format is `$role(/.*)?`. When a job is created, -the scheduler requires that the `$role` component matches the `role` field in the job -configuration, and will reject the job creation otherwise. The remainder of the attribute is -free-form. We've developed the idiom of formatting this attribute as `$role/$job`, but do not -enforce this. For example: a job `devcluster/www-data/prod/hello` with a dedicated constraint set as -`www-data/web.multi` will have its tasks scheduled only on Mesos slaves configured with: -`--attributes=dedicated:www-data/web.multi`. - -A wildcard (`*`) may be used for the role portion of the dedicated attribute, which will allow any -owner to elect for a job to run on the host(s). For example: tasks from both -`devcluster/www-data/prod/hello` and `devcluster/vagrant/test/hello` with a dedicated constraint -formatted as `*/web.multi` will be scheduled only on Mesos slaves configured with -`--attributes=dedicated:*/web.multi`. This may be useful when assembling a virtual cluster of -machines sharing the same set of traits or requirements. - -##### Example -Consider the following slave command line: - - mesos-slave --attributes="dedicated:db_team/redis" ... - -And this job configuration: - - Service( - name = 'redis', - role = 'db_team', - constraints = { - 'dedicated': 'db_team/redis' - } - ... 
- ) - -The job configuration is indicating that it should only be scheduled on slaves with the attribute -`dedicated:db_team/redis`. Additionally, Aurora will prevent any tasks that do _not_ have that -constraint from running on those slaves. - -## Best practices -### Diversity -Data centers are often organized with hierarchical failure domains. Common failure domains -include hosts, racks, rows, and PDUs. If you have this information available, it is wise to tag -the mesos-slave with them as -[attributes](https://mesos.apache.org/documentation/attributes-resources/). - -When it comes time to schedule jobs, Aurora will automatically spread them across the failure -domains as specified in the -[job configuration](configuration-reference.md#specifying-scheduling-constraints). - -Note: in virtualized environments like EC2, the only attribute that usually makes sense for this -purpose is `host`. - -## Common problems -So you've started your first cluster and are running into some issues? We've collected some common -stumbling blocks and solutions here to help get you moving. - -### Replicated log not initialized - -#### Symptoms -- Scheduler RPCs and web interface claim `Storage is not READY` -- Scheduler log repeatedly prints messages like - - ``` - I1016 16:12:27.234133 26081 replica.cpp:638] Replica in EMPTY status - received a broadcasted recover request - I1016 16:12:27.234256 26084 recover.cpp:188] Received a recover response - from a replica in EMPTY status - ``` - -#### Solution -When you create a new cluster, you need to inform a quorum of schedulers that they are safe to -consider their database to be empty by [initializing](#initializing-the-replicated-log) the -replicated log. This is done to prevent the scheduler from modifying the cluster state in the event -of multiple simultaneous disk failures or, more likely, misconfiguration of the replicated log path. - -### Scheduler not registered - -#### Symptoms -Scheduler log contains - - Framework has not been registered within the tolerated delay. - -#### Solution -Double-check that the scheduler is configured correctly to reach the master. If you are registering -the master in ZooKeeper, make sure command line argument to the master: - - --zk=zk://$ZK_HOST:2181/mesos/master - -is the same as the one on the scheduler: - - -mesos_master_address=zk://$ZK_HOST:2181/mesos/master - -## Changing Scheduler Quorum Size -Special care needs to be taken when changing the size of the Aurora scheduler quorum. -Since Aurora uses a Mesos replicated log, similar steps need to be followed as when -[changing the mesos quorum size](http://mesos.apache.org/documentation/latest/operational-guide). - -### Preparation -Increase [-native_log_quorum_size](storage-config.md#-native_log_quorum_size) on each -existing scheduler and restart them. When updating from 3 to 5 schedulers, the quorum size -would grow from 2 to 3. - -### Adding New Schedulers -Start the new schedulers with `-native_log_quorum_size` set to the new value. Failing to -first increase the quorum size on running schedulers can in some cases result in corruption -or truncating of the replicated log used by Aurora. In that case, see the documentation on -[recovering from backup](storage-config.md#recovering-from-a-scheduler-backup). 
## Common problems
So you've started your first cluster and are running into some issues? We've collected some common
stumbling blocks and solutions here to help get you moving.

### Replicated log not initialized

#### Symptoms
- Scheduler RPCs and web interface claim `Storage is not READY`
- Scheduler log repeatedly prints messages like

  ```
  I1016 16:12:27.234133 26081 replica.cpp:638] Replica in EMPTY status
  received a broadcasted recover request
  I1016 16:12:27.234256 26084 recover.cpp:188] Received a recover response
  from a replica in EMPTY status
  ```

#### Solution
When you create a new cluster, you need to inform a quorum of schedulers that it is safe for them
to consider their database empty by [initializing](#initializing-the-replicated-log) the
replicated log. This is done to prevent the scheduler from modifying the cluster state in the event
of multiple simultaneous disk failures or, more likely, misconfiguration of the replicated log path.

### Scheduler not registered

#### Symptoms
Scheduler log contains

    Framework has not been registered within the tolerated delay.

#### Solution
Double-check that the scheduler is configured correctly to reach the master. If you are registering
the master in ZooKeeper, make sure the command line argument to the master:

    --zk=zk://$ZK_HOST:2181/mesos/master

is the same as the one on the scheduler:

    -mesos_master_address=zk://$ZK_HOST:2181/mesos/master

## Changing Scheduler Quorum Size
Special care needs to be taken when changing the size of the Aurora scheduler quorum.
Since Aurora uses a Mesos replicated log, similar steps need to be followed as when
[changing the mesos quorum size](http://mesos.apache.org/documentation/latest/operational-guide).

### Preparation
Increase [-native_log_quorum_size](storage-config.md#-native_log_quorum_size) on each
existing scheduler and restart them. When updating from 3 to 5 schedulers, the quorum size
would grow from 2 to 3.

### Adding New Schedulers
Start the new schedulers with `-native_log_quorum_size` set to the new value. Failing to
first increase the quorum size on running schedulers can in some cases result in corruption
or truncation of the replicated log used by Aurora. In that case, see the documentation on
[recovering from backup](storage-config.md#recovering-from-a-scheduler-backup).
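The quorum is a strict majority of the replicated log replicas, so the target value from the
preparation step above can be sanity-checked with a quick sketch (the helper name is illustrative,
not an Aurora API):

    def native_log_quorum_size(num_schedulers):
      # A strict majority of replicas: floor(N / 2) + 1.
      return num_schedulers // 2 + 1

    assert native_log_quorum_size(3) == 2  # three schedulers -> quorum of 2
    assert native_log_quorum_size(5) == 3  # five schedulers -> quorum of 3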
http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/design-documents.md
----------------------------------------------------------------------
diff --git a/docs/design-documents.md b/docs/design-documents.md
deleted file mode 100644
index 4d14caa..0000000
--- a/docs/design-documents.md
+++ /dev/null
@@ -1,19 +0,0 @@

# Design Documents

Since Aurora's inception as an Apache project, larger feature additions to the code base have been
discussed in the form of design documents. Design documents are living documents until a consensus
has been reached to implement a feature in the proposed form.

Current and past documents:

* [Command Hooks for the Aurora Client](design/command-hooks.md)
* [Health Checks for Updates](https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit)
* [JobUpdateDiff thrift API](https://docs.google.com/document/d/1Fc_YhhV7fc4D9Xv6gJzpfooxbK4YWZcvzw6Bd3qVTL8/edit)
* [REST API RFC](https://docs.google.com/document/d/11_lAsYIRlD5ETRzF2eSd3oa8LXAHYFD8rSetspYXaf4/edit)
* [Revocable Mesos offers in Aurora](https://docs.google.com/document/d/1r1WCHgmPJp5wbrqSZLsgtxPNj3sULfHrSFmxp2GyPTo/edit)
* [Supporting the Mesos Universal Containerizer](https://docs.google.com/document/d/111T09NBF2zjjl7HE95xglsDpRdKoZqhCRM5hHmOfTLA/edit?usp=sharing)
* [Tier Management In Apache Aurora](https://docs.google.com/document/d/1erszT-HsWf1zCIfhbqHlsotHxWUvDyI2xUwNQQQxLgs/edit?usp=sharing)
* [Ubiquitous Jobs](https://docs.google.com/document/d/12hr6GnUZU3mc7xsWRzMi3nQILGB-3vyUxvbG-6YmvdE/edit)

Design documents can be found in the Aurora issue tracker via the query [`project = AURORA AND text ~ "docs.google.com" ORDER BY created`](https://issues.apache.org/jira/browse/AURORA-1528?jql=project%20%3D%20AURORA%20AND%20text%20~%20%22docs.google.com%22%20ORDER%20BY%20created).

http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/design/command-hooks.md
----------------------------------------------------------------------
diff --git a/docs/design/command-hooks.md b/docs/design/command-hooks.md
deleted file mode 100644
index 3f3f70f..0000000
--- a/docs/design/command-hooks.md
+++ /dev/null
@@ -1,102 +0,0 @@

# Command Hooks for the Aurora Client

## Introduction/Motivation

We've got hooks in the client that surround API calls. These are
pretty awkward, because they don't correlate with user actions. For
example, suppose we wanted a policy that said users weren't allowed to
kill all instances of a production job at once.

Right now, all that we could hook would be the "killJob" api call. But
kill (at least in newer versions of the client) normally runs in
batches. If a user called killall, what we would see on the API level
is a series of "killJob" calls, each of which specified a batch of
instances. We wouldn't be able to distinguish between really killing
all instances of a job (which is forbidden under this policy), and
carefully killing in batches (which is permitted). In each case, the
hook would just see a series of API calls, and couldn't find out what
the actual command being executed was!

For most policy enforcement, what we really want to be able to do is
look at and vet the commands that a user is performing, not the API
calls that the client uses to implement those commands.

So I propose that we add a new kind of hook, which surrounds noun/verb
commands. A hook will register itself to handle a collection of (noun,
verb) pairs. Whenever any of those noun/verb commands are invoked, the
hook's methods will be called around the execution of the verb. A
pre-hook will have the ability to reject a command, preventing the
verb from being executed.

## Registering Hooks

These hooks will be registered via configuration plugins. A configuration plugin
can register hooks using an API. Hooks registered this way are, effectively,
hardwired into the client executable.

The order of execution of hooks is unspecified: they may be called in
any order. There is no way to guarantee that one hook will execute
before some other hook.

### Global Hooks

Hooks registered via the Python call are called _global_ hooks,
because they will run for all configurations, whether or not they
specify any hooks in the configuration file.

In the implementation, hooks are registered in the module
`apache.aurora.client.cli.command_hooks`, using the class
`GlobalCommandHookRegistry`. A global hook can be registered by calling
`GlobalCommandHookRegistry.register_command_hook` in a configuration plugin.

### The API

    from abc import abstractmethod

    class CommandHook(object):
      @property
      def name(self):
        """Returns a name for the hook."""

      def get_nouns(self):
        """Return the nouns that have verbs that should invoke this hook."""

      def get_verbs(self, noun):
        """Return the verbs for a particular noun that should invoke this hook."""

      @abstractmethod
      def pre_command(self, noun, verb, context, commandline):
        """Execute a hook before invoking a verb.
        * noun: the noun being invoked.
        * verb: the verb being invoked.
        * context: the context object that will be used to invoke the verb.
          The options object will be initialized before calling the hook.
        * commandline: the original argv collection used to invoke the client.
        Returns: True if the command should be allowed to proceed; False if the command
        should be rejected.
        """

      def post_command(self, noun, verb, context, commandline, result):
        """Execute a hook after invoking a verb.
        * noun: the noun being invoked.
        * verb: the verb being invoked.
        * context: the context object that was used to invoke the verb.
          The options object will be initialized before calling the hook.
        * commandline: the original argv collection used to invoke the client.
        * result: the result code returned by the verb.
        Returns: nothing.
        """

    class GlobalCommandHookRegistry(object):
      @classmethod
      def register_command_hook(cls, hook):
        pass
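For illustration, a hypothetical global hook enforcing the motivating policy above, rejecting
`job killall` outright, might look like:

    class BlockKillallHook(CommandHook):
      """Hypothetical hook that rejects every 'job killall' invocation."""

      @property
      def name(self):
        return 'block-killall'

      def get_nouns(self):
        return ['job']

      def get_verbs(self, noun):
        return ['killall'] if noun == 'job' else []

      def pre_command(self, noun, verb, context, commandline):
        # Returning False rejects the command before the verb executes.
        return False

      def post_command(self, noun, verb, context, commandline, result):
        pass

    # Registered from a configuration plugin:
    GlobalCommandHookRegistry.register_command_hook(BlockKillallHook())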
### Skipping Hooks

To skip hooks, a user passes the command-line option `--skip-hooks`, which can either
name specific hooks to skip, or "all":

* `aurora --skip-hooks=all job create east/bozo/devel/myjob` will create a job
  without running any hooks.
* `aurora --skip-hooks=test,iq job create east/bozo/devel/myjob` will create a job,
  and will skip only the hooks named "test" and "iq".

http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/developing-aurora-client.md
----------------------------------------------------------------------
diff --git a/docs/developing-aurora-client.md b/docs/developing-aurora-client.md
deleted file mode 100644
index 27f1c97..0000000
--- a/docs/developing-aurora-client.md
+++ /dev/null
@@ -1,93 +0,0 @@

Getting Started
===============

The client is written in Python, and uses the
[Pants](http://pantsbuild.github.io/python-readme.html) build tool.

Client Configuration
====================

The client uses a configuration file that specifies available clusters. More information about the
contents of this file can be found in the
[Client Cluster Configuration](client-cluster-configuration.md) documentation. Information about
how the client locates this file can be found in the
[Client Commands](client-commands.md#cluster-configuration) documentation.

Building and Testing the Client
===============================

Building and testing the client code are both done using Pants. The relevant targets to know about
are:

 * Build a client executable: `./pants binary src/main/python/apache/aurora/client:aurora`
 * Test client code: `./pants test src/test/python/apache/aurora/client/cli:all`

If you want to build a source distribution of the client, you need to run
`./build-support/release/make-python-sdists`.

Running/Debugging the Client
============================

For manually testing client changes against a cluster, we use [Vagrant](https://www.vagrantup.com/).
To start a virtual cluster, you need to install Vagrant, and then run `vagrant up` from the root of
the aurora workspace. This will create a vagrant host named "devcluster", with a mesos master, a set
of mesos slaves, and an aurora scheduler.

If you have a change you would like to test in your local cluster, you'll need to rebuild the
client:

    vagrant ssh -c 'aurorabuild client'

Once this completes, the `aurora` command will reflect your changes.

Running/Debugging the Client in PyCharm
=======================================

It's possible to use PyCharm to run and debug both the client and client tests in an IDE. In order
to do this, first run:

    build-support/python/make-pycharm-virtualenv

This script will configure a virtualenv with all of our Python requirements. Once the script
completes it will emit instructions for configuring PyCharm:

    Your PyCharm environment is now set up. You can open the project root
    directory with PyCharm.

    Once the project is loaded:
      - open project settings
      - click 'Project Interpreter'
      - click the cog in the upper-right corner
      - click 'Add Local'
      - select 'build-support/python/pycharm.venv/bin/python'
      - click 'OK'

### Running/Debugging Tests

After following these instructions, you should now be able to run/debug tests directly from the IDE
by right-clicking on a test (or test class) and choosing to run or debug:

[![Debug Client Test](images/debug-client-test.png)](images/debug-client-test.png)

If you've set a breakpoint, you can see the run will now stop and let you debug:

[![Debugging Client Test](images/debugging-client-test.png)](images/debugging-client-test.png)

### Running/Debugging the Client

Actually running and debugging the client is unfortunately a bit more complex. You'll need to create
a Run configuration:

* Go to Run → Edit Configurations
* Click the + icon to add a new configuration.
* Choose python and name the configuration 'client'.
* Set the script path to `/your/path/to/aurora/src/main/python/apache/aurora/client/cli/client.py`
* Set the script parameters to the command you want to run (e.g. `job status `)
* Expand the Environment section and click the ellipsis to add a new environment variable
* Click the + at the bottom to add a new variable named AURORA_CONFIG_ROOT whose value is the
  path where your cluster configuration can be found. For example, to talk to the scheduler
  running in the vagrant image, it would be set to `/your/path/to/aurora/examples/vagrant` (this
  is the directory where our example clusters.json is found).
* You should now be able to run and debug this configuration!

Making thrift schema changes
============================

See [this document](thrift-deprecation.md) for any thrift related changes.

http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/developing-aurora-scheduler.md
----------------------------------------------------------------------
diff --git a/docs/developing-aurora-scheduler.md b/docs/developing-aurora-scheduler.md
deleted file mode 100644
index a703871..0000000
--- a/docs/developing-aurora-scheduler.md
+++ /dev/null
@@ -1,163 +0,0 @@

Java code in the aurora repo is built with [Gradle](http://gradle.org).

Prerequisite
============

When using Apache Aurora checked out from the source repository or the binary
distribution, the Gradle wrapper and JavaScript dependencies are provided.
However, you need to manually install them when using the source release
downloads:

1. Install Gradle following the instructions on the [Gradle web site](http://gradle.org)
2. From the root directory of the Apache Aurora project, generate the gradle
   wrapper by running:

        gradle wrapper

Getting Started
===============

You will need Java 8 installed and on your `PATH`, or unzipped somewhere with `JAVA_HOME` set. Then

    ./gradlew tasks

will bootstrap the build system and show available tasks. This can take a while the first time you
run it, but subsequent runs will be much faster due to cached artifacts.

Running the Tests
-----------------
Aurora has a comprehensive unit test suite. To run the tests use

    ./gradlew build

Gradle will only re-run tests when their dependencies have changed. To force a re-run of all
tests use

    ./gradlew clean build

Running the build with code quality checks
------------------------------------------
To speed up development iteration, the plain gradle commands will not run static analysis tools.
However, you should run these before posting a review diff, and **always** run them before pushing a
commit to origin/master.

    ./gradlew build -Pq

Running integration tests
-------------------------
To run the same tests that are run in the Apache Aurora continuous integration
environment:

    ./build-support/jenkins/build.sh

In addition, there is an end-to-end test that runs a suite of aurora commands
using a virtual cluster:

    ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh

Creating a bundle for deployment
--------------------------------
Gradle can create a zip file containing Aurora, all of its dependencies, and a launch script with

    ./gradlew distZip

or a tar file containing the same files with

    ./gradlew distTar

The output file will be written to `dist/distributions/aurora-scheduler.zip` or
`dist/distributions/aurora-scheduler.tar`.

Developing Aurora Java code
===========================

Setting up an IDE
-----------------
Gradle can generate project files for your IDE. To generate an IntelliJ IDEA project run

    ./gradlew idea

and import the generated `aurora.ipr` file.

Adding or Upgrading a Dependency
--------------------------------
New dependencies can be added from Maven central by adding a `compile` dependency to `build.gradle`.
For example, to add a dependency on `com.example`'s `example-lib` 1.0, add this block:

    compile 'com.example:example-lib:1.0'

NOTE: Anyone thinking about adding a new dependency should first familiarize themselves with the
Apache Foundation's third-party licensing
[policy](http://www.apache.org/legal/resolved.html#category-x).

Developing Aurora UI
====================

Installing bower (optional)
---------------------------
Third party JS libraries used in Aurora (located at 3rdparty/javascript/bower_components) are
managed by bower, a JS dependency manager. Bower is only required if you plan to add, remove or
update JS libraries. Bower can be installed using the following command:

    npm install -g bower

Bower depends on node.js and npm. The easiest way to install node on a Mac is via brew:

    brew install node

For more node.js installation options refer to https://github.com/joyent/node/wiki/Installation.

More info on installing and using bower can be found at http://bower.io/. Once installed, you can
use the following commands to view and modify the bower repo at
3rdparty/javascript/bower_components:

    bower list
    bower install
    bower remove
    bower update
    bower help

Faster Iteration in Vagrant
---------------------------
The scheduler serves UI assets from the classpath. For production deployments this means the assets
are served from within a jar. However, for faster development iteration, the vagrant image is
configured to add the `scheduler` subtree of `/vagrant/dist/resources/main` to the head of
`CLASSPATH`. This path is configured as a shared filesystem to the path on the host system where
your Aurora repository lives. This means that any updates under `dist/resources/main/scheduler` in
your checkout will be reflected immediately in the UI served from within the vagrant image.

The one caveat to this is that this path is under `dist`, not `src`. This is because the assets must
be processed by gradle before they can be served. So, unfortunately, you cannot just save your local
changes and see them reflected in the UI; you must first run `./gradlew processResources`. This is
less than ideal, but better than having to restart the scheduler after every change. Additionally,
gradle makes this process somewhat easier with the `--continuous` flag. If you run
`./gradlew processResources --continuous`, gradle will monitor the filesystem for changes and run
the task automatically as necessary. This doesn't quite provide hot-reload capabilities, but it does
allow for <5s from save to changes being visible in the UI, with no further action required on the
part of the developer.

Developing the Aurora Build System
==================================

Bootstrapping Gradle
--------------------
The following files were autogenerated by `gradle wrapper` using gradle 1.8's
[Wrapper](http://www.gradle.org/docs/1.8/dsl/org.gradle.api.tasks.wrapper.Wrapper.html) plugin and
should not be modified directly:

    ./gradlew
    ./gradlew.bat
    ./gradle/wrapper/gradle-wrapper.jar
    ./gradle/wrapper/gradle-wrapper.properties

To upgrade Gradle, unpack the new version somewhere, run `/path/to/new/gradle wrapper` in the
repository root, and commit the changed files.

Making thrift schema changes
============================

See [this document](thrift-deprecation.md) for any thrift related changes.
http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/development/client.md
----------------------------------------------------------------------
diff --git a/docs/development/client.md b/docs/development/client.md
new file mode 100644
index 0000000..a5fee37
--- /dev/null
+++ b/docs/development/client.md
@@ -0,0 +1,81 @@

Developing the Aurora Client
============================

The client is written in Python, and uses the
[Pants](http://pantsbuild.github.io/python-readme.html) build tool.

Building and Testing
--------------------
Building and testing the client code are both done using Pants. The relevant targets to know about
are:

 * Build a client executable: `./pants binary src/main/python/apache/aurora/client:aurora`
 * Test client code: `./pants test src/test/python/apache/aurora/client/cli:all`

If you want to build a source distribution of the client, you need to run
`./build-support/release/make-python-sdists`.

Running/Debugging
-----------------
For manually testing client changes against a cluster, we use [Vagrant](https://www.vagrantup.com/).
To start a virtual cluster, you need to install Vagrant, and then run `vagrant up` from the root of
the aurora workspace. This will create a vagrant host named "devcluster", with a mesos master, a set
of mesos slaves, and an aurora scheduler.

If you have a change you would like to test in your local cluster, you'll need to rebuild the
client:

    vagrant ssh -c 'aurorabuild client'

Once this completes, the `aurora` command will reflect your changes.

Running/Debugging in PyCharm
----------------------------
It's possible to use PyCharm to run and debug both the client and client tests in an IDE. In order
to do this, first run:

    build-support/python/make-pycharm-virtualenv

This script will configure a virtualenv with all of our Python requirements. Once the script
completes it will emit instructions for configuring PyCharm:

    Your PyCharm environment is now set up. You can open the project root
    directory with PyCharm.

    Once the project is loaded:
      - open project settings
      - click 'Project Interpreter'
      - click the cog in the upper-right corner
      - click 'Add Local'
      - select 'build-support/python/pycharm.venv/bin/python'
      - click 'OK'

### Running/Debugging Tests
After following these instructions, you should now be able to run/debug tests directly from the IDE
by right-clicking on a test (or test class) and choosing to run or debug:

[![Debug Client Test](../images/debug-client-test.png)](../images/debug-client-test.png)

If you've set a breakpoint, you can see the run will now stop and let you debug:

[![Debugging Client Test](../images/debugging-client-test.png)](../images/debugging-client-test.png)

### Running/Debugging the Client
Actually running and debugging the client is unfortunately a bit more complex. You'll need to create
a Run configuration:

* Go to Run → Edit Configurations
* Click the + icon to add a new configuration.
* Choose python and name the configuration 'client'.
* Set the script path to `/your/path/to/aurora/src/main/python/apache/aurora/client/cli/client.py`
* Set the script parameters to the command you want to run (e.g. `job status `)
* Expand the Environment section and click the ellipsis to add a new environment variable
* Click the + at the bottom to add a new variable named AURORA_CONFIG_ROOT whose value is the
  path where your cluster configuration can be found.
  For example, to talk to the scheduler
  running in the vagrant image, it would be set to `/your/path/to/aurora/examples/vagrant` (this
  is the directory where our example clusters.json is found).
* You should now be able to run and debug this configuration!

http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/development/committers-guide.md
----------------------------------------------------------------------
diff --git a/docs/development/committers-guide.md b/docs/development/committers-guide.md
new file mode 100644
index 0000000..70f67a6
--- /dev/null
+++ b/docs/development/committers-guide.md
@@ -0,0 +1,86 @@

Committer's Guide
=================

Information for official Apache Aurora committers.

Setting up your email account
-----------------------------
Once your Apache ID has been set up, you can configure your account, add ssh keys, and set up an
email forwarding address at

    http://id.apache.org

Additional instructions for setting up your new committer email can be found at

    http://www.apache.org/dev/user-email.html

The recommended setup is to configure all services (mailing lists, JIRA, ReviewBoard) to send
emails to your @apache.org email address.

Creating a gpg key for releases
-------------------------------
In order to create a release candidate, you will need a gpg key published to an external key
server, and that key will need to be added to our KEYS file as well.

1. Create a key:

        gpg --gen-key

2. Add your gpg key to the Apache Aurora KEYS file:

        git clone https://git-wip-us.apache.org/repos/asf/aurora.git
        (gpg --list-sigs <your key id> && gpg --armor --export <your key id>) >> KEYS
        git add KEYS && git commit -m "Adding gpg key for <your name>"
        ./rbt post -o -g

3. Publish the key to an external key server:

        gpg --keyserver pgp.mit.edu --send-keys <your key id>

4. Copy the updated KEYS file to the Apache Aurora svn dist locations listed below:

        https://dist.apache.org/repos/dist/dev/aurora/KEYS
        https://dist.apache.org/repos/dist/release/aurora/KEYS

5. Add your key to git config for use with the release scripts:

        git config --global user.signingkey <your key id>

Creating a release
------------------
The following will guide you through the steps to create a release candidate, vote, and finally an
official Apache Aurora release. Before starting, your gpg key should be in the KEYS file and you
must have access to commit to the dist.a.o repositories.

1. Ensure that all issues resolved for this release candidate are tagged with the correct Fix
Version in Jira; the changelog script will use this to generate the CHANGELOG in step #2.

2. Create a release candidate. This will automatically update the CHANGELOG and commit it, create a
branch, and update the current version within the trunk. To create a minor version update and
publish it, run

        ./build-support/release/release-candidate -l m -p

3. Update, if necessary, the draft email created by the `release-candidate` script in step #2 and
send the [VOTE] email to the dev@ mailing list. You can verify the release signature and checksums
by running

        ./build-support/release/verify-release-candidate

4. Wait for the vote to complete. If the vote fails, close the vote by replying to the initial
[VOTE] email sent in step #3, editing the subject to [RESULT][VOTE] ... and noting the failure
reason (example [here](http://markmail.org/message/d4d6xtvj7vgwi76f)).
Now address any issues, go back to
step #1, and run again; this time use the -r flag to increment the release candidate
version. This will automatically clean up the release candidate rc0 branch and source distribution.

        ./build-support/release/release-candidate -l m -r 1 -p

5. Once the vote has successfully passed, create the release:

        ./build-support/release/release

6. Update the draft email created from the `release` script in step #5 to include the Apache IDs of
all binding votes and send the [RESULT][VOTE] email to the dev@ mailing list.

http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/development/design-documents.md
----------------------------------------------------------------------
diff --git a/docs/development/design-documents.md b/docs/development/design-documents.md
new file mode 100644
index 0000000..b01cfd7
--- /dev/null
+++ b/docs/development/design-documents.md
@@ -0,0 +1,20 @@

Design Documents
================

Since Aurora's inception as an Apache project, larger feature additions to the code base have been
discussed in the form of design documents. Design documents are living documents until a consensus
has been reached to implement a feature in the proposed form.

Current and past documents:

* [Command Hooks for the Aurora Client](design/command-hooks.md)
* [Health Checks for Updates](https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit)
* [JobUpdateDiff thrift API](https://docs.google.com/document/d/1Fc_YhhV7fc4D9Xv6gJzpfooxbK4YWZcvzw6Bd3qVTL8/edit)
* [REST API RFC](https://docs.google.com/document/d/11_lAsYIRlD5ETRzF2eSd3oa8LXAHYFD8rSetspYXaf4/edit)
* [Revocable Mesos offers in Aurora](https://docs.google.com/document/d/1r1WCHgmPJp5wbrqSZLsgtxPNj3sULfHrSFmxp2GyPTo/edit)
* [Supporting the Mesos Universal Containerizer](https://docs.google.com/document/d/111T09NBF2zjjl7HE95xglsDpRdKoZqhCRM5hHmOfTLA/edit?usp=sharing)
* [Tier Management In Apache Aurora](https://docs.google.com/document/d/1erszT-HsWf1zCIfhbqHlsotHxWUvDyI2xUwNQQQxLgs/edit?usp=sharing)
* [Ubiquitous Jobs](https://docs.google.com/document/d/12hr6GnUZU3mc7xsWRzMi3nQILGB-3vyUxvbG-6YmvdE/edit)

Design documents can be found in the Aurora issue tracker via the query [`project = AURORA AND text ~ "docs.google.com" ORDER BY created`](https://issues.apache.org/jira/browse/AURORA-1528?jql=project%20%3D%20AURORA%20AND%20text%20~%20%22docs.google.com%22%20ORDER%20BY%20created).