asterixdb-notifications mailing list archives

From "Yingyi Bu (Code Review)" <do-not-re...@asterixdb.incubator.apache.org>
Subject Change in asterixdb[master]: Add documentation for Ansible and AWS installation options.
Date Tue, 14 Mar 2017 01:27:46 GMT
Yingyi Bu has uploaded a new change for review.

  https://asterix-gerrit.ics.uci.edu/1576

Change subject: Add documentation for Ansible and AWS installation options.
......................................................................

Add documentation for Ansible and AWS installation options.

Change-Id: I0036823392ab6dde8bddbce8b141aaf166f4e3ca
---
M README.md
A asterixdb/asterix-doc/src/site/markdown/ansible.md
A asterixdb/asterix-doc/src/site/markdown/aws.md
M asterixdb/asterix-doc/src/site/markdown/index.md
M asterixdb/asterix-doc/src/site/markdown/ncservice.md
M asterixdb/asterix-doc/src/site/site.xml
6 files changed, 350 insertions(+), 34 deletions(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/76/1576/1

diff --git a/README.md b/README.md
index 792e355..105286d 100644
--- a/README.md
+++ b/README.md
@@ -22,28 +22,28 @@
 
 AsterixDB is a BDMS (Big Data Management System) with a rich feature set that sets it apart from other Big Data platforms.  Its feature set makes it well-suited to modern needs such as web data warehousing and social data storage and analysis. AsterixDB has:
 
-- __Data model__</br>
+- __Data model__<br/>
 A semistructured NoSQL style data model (ADM) resulting from extending JSON with object database ideas
 
-- __Query languages__</br>
+- __Query languages__<br/>
 Two expressive and declarative query languages (SQL++ and AQL) that support a broad range of queries and analysis over semistructured data
 
-- __Scalability__</br>
+- __Scalability__<br/>
 A parallel runtime query execution engine, Apache Hyracks, that has been scale-tested on up to 1000+ cores and 500+ disks
 
-- __Native storage__</br>
+- __Native storage__<br/>
 Partitioned LSM-based data storage and indexing to support efficient ingestion and management of semistructured data
 
-- __External storage__</br>
+- __External storage__<br/>
 Support for query access to externally stored data (e.g., data in HDFS) as well as to data stored natively by AsterixDB
 
-- __Data types__</br>
+- __Data types__<br/>
 A rich set of primitive data types, including spatial and temporal data in addition to integer, floating point, and textual data
 
-- __Indexing__</br>
+- __Indexing__<br/>
 Secondary indexing options that include B+ trees, R trees, and inverted keyword (exact and fuzzy) index types
 
-- __Transactions__</br>
+- __Transactions__<br/>
 Basic transactional (concurrency and recovery) capabilities akin to those of a NoSQL store
 
 Learn more about AsterixDB at its [website](http://asterixdb.apache.org).
diff --git a/asterixdb/asterix-doc/src/site/markdown/ansible.md b/asterixdb/asterix-doc/src/site/markdown/ansible.md
new file mode 100644
index 0000000..16d8973
--- /dev/null
+++ b/asterixdb/asterix-doc/src/site/markdown/ansible.md
@@ -0,0 +1,136 @@
+<!--
+ ! Licensed to the Apache Software Foundation (ASF) under one
+ ! or more contributor license agreements.  See the NOTICE file
+ ! distributed with this work for additional information
+ ! regarding copyright ownership.  The ASF licenses this file
+ ! to you under the Apache License, Version 2.0 (the
+ ! "License"); you may not use this file except in compliance
+ ! with the License.  You may obtain a copy of the License at
+ !
+ !   http://www.apache.org/licenses/LICENSE-2.0
+ !
+ ! Unless required by applicable law or agreed to in writing,
+ ! software distributed under the License is distributed on an
+ ! "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ ! KIND, either express or implied.  See the License for the
+ ! specific language governing permissions and limitations
+ ! under the License.
+ !-->
+
+## <a id="toc">Table of Contents</a> ##
+
+* [Introduction](#Introduction)
+* [Prerequisites](#Prerequisites)
+* [Configuration and parameters](#config)
+* [Manage the lifecycle of your instance](#lifecycle)
+
+## <a id="Introduction">Introduction</a>
+This installation option wraps the basic, low-level installation binaries described in the [NCService
+installation option](ncservice.html), and provides several simple scripts to deploy, start, stop,
+and erase an AsterixDB instance on a cluster without requiring users to interact with each individual
+node in the cluster.
+
+## <a id="Prerequisites">Prerequisites</a>
+  *  Supported operating systems: **Linux** and **MacOS**
+
+  *  Install pip on your client machine:
+
+          CentOS: sudo yum install python-pip
+          Ubuntu: sudo apt-get install python-pip
+          MacOS:  brew install pip
+
+  *  Install Ansible, boto, and boto3 on your client machine:
+
+          pip install ansible
+          pip install boto
+          pip install boto3
+
+     Make sure that the version of Ansible is no less than 2.2.1.0.
+
+  *  Configure passwordless ssh from your current client that runs the scripts to all nodes listed in conf/inventory.
+
+  *  Download a released [simple server package](http://asterixdb.apache.org/download.html).
+
+     Alternatively, you can follow the [instructions](https://github.com/apache/asterixdb#build-from-source) to
+     build from source.
+
+  *  In the extracted directory from the `simple server package`, navigate to `opt/ansible/`
+
+         $cd opt/ansible
+
+     The following files and directories are in the directory `opt/ansible`:
+
+         README  bin  conf  yaml
+
+     `bin` contains scripts that deploy, start, stop and erase an AsterixDB cluster instance, according to
+     the configuration specified in files under `conf/`. `yaml` contains internal Ansible scripts that the shell
+     scripts in `bin` use.
+
+## <a id="config">Configuration and parameters</a>
+  *  **Parameters**. Edit the instance configuration file `conf/cc.conf` when necessary.
+     You can add or update any parameters in the **[common]** and **[nc]** sections (except IPs and ports).
+     For example:
+
+           [common]
+           log.level=INFO
+
+           [nc]
+           txn.log.dir=txnlog
+           iodevices=iodevice
+           command=asterixnc
+
+     More parameters and their usage can be found [here](ncservice.html#Parameters).
+     Note that with this installation option, all parameters in the **[cc]** section will use defaults and cannot be
+     changed.
+
+
+  *  **Nodes and account**. Edit the inventory file `conf/inventory` when necessary.
+     You mostly only need to specify the node DNS names (or IPs) for the cluster controller, i.e., the master node,
+     in the **[cc]** section, and node controllers, i.e., slave nodes, in the **[ncs]** section.
+     The following example configures a local "cluster" that only has one slave node (localhost) and uses
+     localhost as the master node too.
+
+          [cc]
+          localhost
+
+          [ncs]
+          localhost
+
+     If the ssh user account for target machines is different from your current username, please uncomment
+     and edit the following two lines:
+
+           ;[all:vars]
+           ;ansible_ssh_user=<fill with your ssh account username>
+
+     If you want to specify advanced Ansible builtin variables, please refer to the following Ansible documentation:
+     http://docs.ansible.com/ansible/intro_inventory.html.
+
+  *  **Remote working directories**. Edit `conf/instance_settings.yml` to change the instance binary directories
+     when necessary. By default, the binary directory will be under the home directory (as the value of
+     Ansible builtin variable ansible_env.HOME) of the ssh user account on each node.
+
+            # The parent directory for the working directory.
+            basedir: "{{ ansible_env.HOME }}"
+
+            # The working directory.
+            binarydir: "{{ basedir }}/{{ product }}"
+
+
+## <a id="lifecycle">Manage the lifecycle of your instance</a>
+  *  Deploy the binary to all nodes:
+
+         bin/deploy.sh
+
+  *  Launch your cluster instance:
+
+         bin/start.sh
+
+     Now you can use the cluster instance.
+
+  * If you want to stop the cluster instance, run the following script:
+
+         bin/stop.sh
+
+  * If you want to remove the binary on all nodes, run the following script:
+
+         bin/erase.sh
diff --git a/asterixdb/asterix-doc/src/site/markdown/aws.md b/asterixdb/asterix-doc/src/site/markdown/aws.md
new file mode 100644
index 0000000..18a51cb
--- /dev/null
+++ b/asterixdb/asterix-doc/src/site/markdown/aws.md
@@ -0,0 +1,170 @@
+<!--
+ ! Licensed to the Apache Software Foundation (ASF) under one
+ ! or more contributor license agreements.  See the NOTICE file
+ ! distributed with this work for additional information
+ ! regarding copyright ownership.  The ASF licenses this file
+ ! to you under the Apache License, Version 2.0 (the
+ ! "License"); you may not use this file except in compliance
+ ! with the License.  You may obtain a copy of the License at
+ !
+ !   http://www.apache.org/licenses/LICENSE-2.0
+ !
+ ! Unless required by applicable law or agreed to in writing,
+ ! software distributed under the License is distributed on an
+ ! "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ ! KIND, either express or implied.  See the License for the
+ ! specific language governing permissions and limitations
+ ! under the License.
+ !-->
+
+## <a id="toc">Table of Contents</a> ##
+
+* [Introduction](#Introduction)
+* [Prerequisites](#Prerequisites)
+* [Configuration](#config)
+* [Manage the lifecycle of your instance](#lifecycle)
+
+## <a id="Introduction">Introduction</a>
+   Note that you can always manually launch a number of Amazon Web Services EC2 instances and then run the
+   Ansible cluster installation scripts as described [here](ansible.html) separately to manage the
+   lifecycle of an AsterixDB instance on those EC2 instances.
+
+   However, via this installation option, we provide a combo solution for automating both AWS EC2
+   and AsterixDB, where you can run only one script to start/stop an AsterixDB instance on AWS.
+
+## <a id="Prerequisites">Prerequisites</a>
+  *  Supported operating systems for the client: **Linux** and **MacOS**
+
+  *  Supported operating systems for Amazon Web Services instances: **Linux**
+
+  *  Install pip on your client machine:
+
+            CentOS: sudo yum install python-pip
+            Ubuntu: sudo apt-get install python-pip
+            MacOS:  brew install pip
+
+  *  Install Ansible, boto, and boto3 on your client machine:
+
+            pip install ansible
+            pip install boto
+            pip install boto3
+
+     Make sure that the version of Ansible is no less than 2.2.1.0.
+
+  *  Download a released [simple server package](http://asterixdb.apache.org/download.html).
+
+     Alternatively, you can follow the [instructions](https://github.com/apache/asterixdb#build-from-source) to
+     build from source.
+
+  *  In the extracted directory from the `simple server package`, navigate to `opt/aws/`
+
+            $cd opt/aws
+
+     The following files and directories are in the directory `opt/aws`:
+
+            README  bin  conf  yaml
+
+     `bin` contains scripts that start and terminate an AWS-based cluster instance, according to the configuration
+     specified in files under `conf/`. `yaml` contains internal Ansible scripts that the shell scripts in `bin` use.
+
+  *  Create an AWS account and an IAM user.
+
+     Set up a security group that you'd like to use for your AWS cluster.
+     **The security group should at least allow all TCP connections from anywhere.**
+     Fill `group` in `conf/aws_settings.yml` with the name of the security group.
+
+  *  Retrieve your AWS EC2 key pair name and fill that for `keypair` in `conf/aws_settings.yml`;
+
+     retrieve your AWS IAM `access key ID` and fill that for `access_key_id` in `conf/aws_settings.yml`;
+
+     retrieve your AWS IAM `secret access key` and fill that for `secret_access_key` in `conf/aws_settings.yml`.
+
+     Note that you can only read or download `access key ID` and `secret access key` once from your AWS console.
+     If you forget them, you have to create new keys and delete the old ones.
+
+  *  Configure your ssh setting by editing `~/.ssh/config` and adding the following entry:
+
+            Host *.amazonaws.com
+                  IdentityFile <path_of_private_key>
+
+     Note that \<path_of_private_key\> should be replaced by the path to the file that stores the private key for the
+     key pair that you uploaded to AWS and used in `conf/aws_settings.yml`. For example:
+
+            Host *.amazonaws.com
+                  IdentityFile ~/.ssh/id_rsa
+
+### <a id="config">Configuration</a>
+  * **AWS settings**.  Edit `conf/aws_settings.yml`. The meaning of each parameter is listed as follows:
+
+            # The OS image id for ec2 instances.
+            image: ami-76fa4116
+
+            # The data center region for ec2 instances.
+            region: us-west-2
+
+            # The tag for each ec2 machine.
+            tag: scale_test
+
+            # The name of a security group that appears in your AWS console.
+            group: default
+
+            # The name of a key pair that appears in your AWS console.
+            keypair: <to be filled>
+
+            # The AWS access key id for your IAM user.
+            access_key_id: <to be filled>
+
+            # The AWS secret key for your IAM user.
+            secret_access_key: <to be filled>
+
+            # The AWS instance type. A full list of available types is at:
+            # https://aws.amazon.com/ec2/instance-types/
+            instance_type: t2.micro
+
+            # The number of ec2 instances that construct a cluster.
+            count: 3
+
+            # The user name.
+            user: ec2-user
+
+            # Whether to reuse one nc machine to host cc.
+            cc_on_nc: false
+
+      **As described in [prerequisites](#Prerequisites), the following parameters must be customized correctly:**
+
+            # The name of a security group that appears in your AWS console.
+            group: default
+
+            # The name of a key pair that appears in your AWS console.
+            keypair: <to be filled>
+
+            # The AWS access key id for your IAM user.
+            access_key_id: <to be filled>
+
+            # The AWS secret key for your IAM user.
+            secret_access_key: <to be filled>
+
+  *  **Remote working directories**. Edit conf/instance_settings.yml to change the instance binary directories
+     when necessary. By default, the binary directory will be under the home directory (as the value of
+     Ansible builtin variable ansible_env.HOME) of the ssh user account on each node.
+
+            # The parent directory for the working directory.
+            basedir: "{{ ansible_env.HOME }}"
+
+            # The working directory.
+            binarydir: "{{ basedir }}/{{ product }}"
+
+
+### <a id="lifecycle">Manage the lifecycle of your instance</a>
+  *  Start an AWS-based cluster:
+
+            bin/start.sh
+
+     Now you can use the cluster instance through the public IP or DNS name of the master node.
+
+  * If you want to stop the cluster instance, run the following script:
+
+            bin/stop.sh
+
+    Note that it will destroy everything in the cluster instance you installed and terminate all AWS nodes
+    for the cluster.
diff --git a/asterixdb/asterix-doc/src/site/markdown/index.md b/asterixdb/asterix-doc/src/site/markdown/index.md
index bc7ca9c..303678d 100644
--- a/asterixdb/asterix-doc/src/site/markdown/index.md
+++ b/asterixdb/asterix-doc/src/site/markdown/index.md
@@ -19,26 +19,34 @@
 
 # AsterixDB #
 
-AsterixDB is a BDMS (Big Data Management System) with a rich feature set that
-sets it apart from other Big Data platforms.
-Its feature set makes it well-suited to modern needs such as web data
-warehousing and social data storage and analysis. AsterixDB has:
+AsterixDB is a BDMS (Big Data Management System) with a rich feature set that sets it apart from other Big Data
+platforms. Its feature set makes it well-suited to modern needs such as web data warehousing and social data
+storage and analysis. AsterixDB has:
 
- * A semistructured NoSQL style data model (ADM) resulting from extending JSON
-   with object database ideas
- * Two expressive and declarative query languages (SQL++ and AQL) that support a broad
-   range of queries and analysis over semistructured data
- * A parallel runtime query execution engine, Apache Hyracks, that has been
-   scale-tested on up to 1000+ cores and 500+ disks
- * Partitioned LSM-based data storage and indexing to support efficient
-   ingestion and management of semistructured data
- * Support for query access to externally stored data (e.g., data in HDFS) as
-   well as to data stored natively by AsterixDB
- * A rich set of primitive data types, including spatial and temporal data in
-   addition to integer, floating point, and textual data
- * Secondary indexing options that include B+ trees, R trees, and inverted
-   keyword (exact and fuzzy) index types
- * Support for fuzzy and spatial queries as well as for more traditional
-   parametric queries
- * Basic transactional (concurrency and recovery) capabilities akin to those of
-   a NoSQL store
+- __Data model__<br/>
+A semistructured NoSQL style data model ([ADM](datamodel.html)) resulting from extending JSON with object database ideas
+
+- __Query languages__<br/>
+Two expressive and declarative query languages ([SQL++](sqlpp/manual.html) and [AQL](aql/manual.html)) that
+support a broad range of queries and analysis over semistructured data
+
+- __Scalability__<br/>
+A parallel runtime query execution engine, Apache Hyracks, that has been scale-tested on up to 1000+ cores and
+500+ disks
+
+- __Native storage__<br/>
+Partitioned LSM-based data storage and indexing to support efficient ingestion and management of semistructured data
+
+- __External storage__<br/>
+Support for query access to externally stored data (e.g., data in HDFS) as well as to data stored natively by AsterixDB
+
+- __Data types__<br/>
+A rich set of primitive data types, including spatial and temporal data in addition to integer, floating point,
+and textual data
+
+- __Indexing__<br/>
+Secondary indexing options that include B+ trees, R trees, and inverted keyword (exact and fuzzy) index types
+
+- __Transactions__<br/>
+Basic transactional (concurrency and recovery) capabilities akin to those of a NoSQL store
+
diff --git a/asterixdb/asterix-doc/src/site/markdown/ncservice.md b/asterixdb/asterix-doc/src/site/markdown/ncservice.md
index 243d908..ad36a58 100644
--- a/asterixdb/asterix-doc/src/site/markdown/ncservice.md
+++ b/asterixdb/asterix-doc/src/site/markdown/ncservice.md
@@ -38,7 +38,7 @@
 
 This folder should contain 4 scripts, two pairs of `.sh` and `.bat` files
 respectively. `start-sample-cluster.sh` will simply start a basic sample cluster
-using the coniguration files located in `samples/local/conf/`.
+using the configuration files located in `samples/local/conf/`.
 
     user@localhost:~/a/o/l/bin
     $./start-sample-cluster.sh
@@ -374,7 +374,7 @@
 | common  | txn.log.partitionsize                     | N/A     | 268435456 (256 MB) |
 
 
-# For the optional NCService process configuration file, the following parameters, under "[ncservice]" section.
+For the optional NCService process configuration file, the following parameters, under "[ncservice]" section.
 
 | Parameter | Meaning |  Default |
 |----------|--------|-------|
diff --git a/asterixdb/asterix-doc/src/site/site.xml b/asterixdb/asterix-doc/src/site/site.xml
index 99c87e5..3e768bf 100644
--- a/asterixdb/asterix-doc/src/site/site.xml
+++ b/asterixdb/asterix-doc/src/site/site.xml
@@ -75,8 +75,10 @@
 
     <menu name="Get Started - Installation">
       <item name="Option 1: using NCService" href="ncservice.html"/>
-      <item name="Option 2: using Managix" href="install.html"/>
-      <item name="Option 3: using YARN" href="yarn.html"/>
+      <item name="Option 2: using Ansible" href="ansible.html"/>
+      <item name="Option 3: using Amazon Web Services" href="aws.html"/>
+      <item name="Option 4: using YARN" href="yarn.html"/>
+      <item name="Option 5: using Managix (deprecated)" href="install.html"/>
     </menu>
 
     <menu name = "AsterixDB Primer">

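The `conf/inventory` layout described in ansible.md above (one **[cc]** section for the cluster controller, one **[ncs]** section for the node controllers) can be sanity-checked before running `bin/deploy.sh`. The following is only a minimal sketch and not part of this change: `read_inventory` and `check_inventory` are hypothetical helper names, the "exactly one cluster controller" rule is an assumption read from the singular wording in the documentation, and Python's `configparser` is used as a stand-in; it is not how Ansible itself parses inventories.

```python
import configparser

# Inventory text copied from the ansible.md example in the patch above.
SAMPLE_INVENTORY = """\
[cc]
localhost

[ncs]
localhost

;[all:vars]
;ansible_ssh_user=<fill with your ssh account username>
"""


def read_inventory(text):
    """Return {section: [host, ...]} for an INI-style inventory string."""
    # allow_no_value lets bare host names (no "=value") count as entries;
    # restricting delimiters to "=" keeps a "host:port" entry in one piece.
    parser = configparser.ConfigParser(allow_no_value=True, delimiters=("=",))
    parser.optionxform = str  # keep host names exactly as written
    parser.read_string(text)
    return {name: list(parser[name]) for name in parser.sections()}


def check_inventory(hosts):
    """Hypothetical sanity check: return a list of problems (empty if none)."""
    problems = []
    # Assumption: the docs speak of "the cluster controller" in the singular.
    if len(hosts.get("cc", [])) != 1:
        problems.append("[cc] should list exactly one cluster controller")
    if not hosts.get("ncs"):
        problems.append("[ncs] should list at least one node controller")
    return problems


if __name__ == "__main__":
    inventory = read_inventory(SAMPLE_INVENTORY)
    print(inventory)
    print(check_inventory(inventory) or "inventory looks plausible")
```

Lines starting with `;` are treated as comments here, so the commented-out `[all:vars]` block from the example is ignored until it is uncommented.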
-- 
To view, visit https://asterix-gerrit.ics.uci.edu/1576
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I0036823392ab6dde8bddbce8b141aaf166f4e3ca
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Yingyi Bu <buyingyi@gmail.com>
