cloudstack-commits mailing list archives

Subject git commit: updated refs/heads/ACS101 to c7243d4
Date Mon, 10 Jun 2013 12:25:27 GMT
Updated Branches:
  refs/heads/ACS101 59a58bcf9 -> c7243d4da

more docs on knife and whirr


Branch: refs/heads/ACS101
Commit: c7243d4da23b1cf01326305386b40e1da88a509a
Parents: 59a58bc
Author: Sebastien Goasguen <>
Authored: Mon Jun 10 08:25:14 2013 -0400
Committer: Sebastien Goasguen <>
Committed: Mon Jun 10 08:25:14 2013 -0400

 docs/acs101/en-US/The_Little__Book.xml |   1 -
 docs/acs101/en-US/knife-cloudstack.xml | 109 ++++++++++++++---
 docs/acs101/en-US/whirr.xml            | 176 ++++++++++++++++++++++++++--
 3 files changed, 262 insertions(+), 24 deletions(-)
diff --git a/docs/acs101/en-US/The_Little__Book.xml b/docs/acs101/en-US/The_Little__Book.xml
index c2dda0f..6f1a08b 100644
--- a/docs/acs101/en-US/The_Little__Book.xml
+++ b/docs/acs101/en-US/The_Little__Book.xml
@@ -25,7 +25,6 @@
     <xi:include href="Book_Info.xml" xmlns:xi="" />
-    <xi:include href="Overview.xml" xmlns:xi="" />
     <xi:include href="Gettinggoing.xml" xmlns:xi="" />
     <xi:include href="Clientsandshells.xml" xmlns:xi="" />
     <xi:include href="Wrappers.xml" xmlns:xi="" />
diff --git a/docs/acs101/en-US/knife-cloudstack.xml b/docs/acs101/en-US/knife-cloudstack.xml
index 4096792..a859311 100644
--- a/docs/acs101/en-US/knife-cloudstack.xml
+++ b/docs/acs101/en-US/knife-cloudstack.xml
@@ -211,35 +211,114 @@ foobar  178.170.XX.XX  m1.small  CentOS 6.4 - Minimal - 64bits  Running
     <section id="bootstrapping-chef">
-        <title>Bootstrapping Chef</title>
+        <title>Bootstrapping Instances with Hosted-Chef</title>
-            Knife is taking it's full potential when used to bootstrap Chef and use it for
configuration management of the instances. Below is an example that does so:
+            Knife reaches its full potential when used to bootstrap Chef and use it for
configuration management of the instances. The easiest way to get started with Chef is to use
<ulink url="">Hosted Chef</ulink>. There is
some great documentation on <ulink url="">how</ulink>
to do it. The basic concept is that you download or create cookbooks locally and publish
them to your own hosted Chef server.
+        </para>
+    </section>
+    <section id="boostrapping-knife">
+        <title>Using Knife with Hosted-Chef</title>
+        <para>
+            With your <emphasis>hosted Chef</emphasis> account created and your
local <emphasis>chef-repo</emphasis> set up, you can start instances on your cloud
and specify the <emphasis>cookbooks</emphasis> to use to configure those instances.
The bootstrapping process will fetch those cookbooks and configure the node. Below is an example
that does so; it uses the <ulink url="">exoscale</ulink>
cloud, which runs on CloudStack. This cloud is enabled as a Basic zone and uses ssh keypairs
and security groups for access.
-$ knife cs server create --service m1.small --template "CentOS 6.4 - Minimal - 64bits" --ipfwd-rules
22 --ssh-user sebgoa --ssh-password totoestcon foobar4
+$ knife cs server create --service Tiny --template "Linux CentOS 6.4 64-bit" --ssh-user root
--identity ~/.ssh/id_rsa --run-list "recipe[apache2]" --ssh-keypair foobar --security-group
www --no-public-ip foobar
-Waiting for Server to be created.......
+Waiting for Server to be created....
+Name:       foobar   
+Public IP:  185.19.XX.XX
+Waiting for sshd.....
+Name:         foobar13       
+Public IP:    185.19.XX.XX  
+Environment:  _default       
+Run List:     recipe[apache2]
+Bootstrapping Chef on 185.19.XX.XX  
+185.19.XX.XX  --2013-06-10 11:47:54--
+185.19.XX.XX  Resolving 
+185.19.XX.XX  184.ZZ.YY.YY
+185.19.XX.XX Connecting to|184.ZZ.XX.XX|:80... 
+185.19.XX.XX connected.
+185.19.XX.XX HTTP request sent, awaiting response... 
+185.19.XX.XX 301 Moved Permanently
+185.19.XX.XX Location: [following]
+185.19.XX.XX --2013-06-10 11:47:55--
+185.19.XX.XX Resolving 
+185.19.XX.XX 184.ZZ.YY.YY
+185.19.XX.XX Reusing existing connection to
+185.19.XX.XX HTTP request sent, awaiting response... 
+185.19.XX.XX 200 OK
+185.19.XX.XX Length: 6509 (6.4K) [application/x-sh]
+185.19.XX.XX Saving to: “STDOUT”
+ 0% [                                       ] 0           --.-K/s              
+100%[======================================>] 6,509       --.-K/s   in 0.1s    
+185.19.XX.XX 2013-06-10 11:47:55 (60.8 KB/s) - written to stdout [6509/6509]
+185.19.XX.XX Downloading Chef 11.4.4 for el...
+185.19.XX.XX Installing Chef 11.4.4
+            </programlisting>
+        </para>
+        <para>
+            Chef will then configure the machine based on the cookbook passed in the --run-list
option; here I set up a simple web server. Note the keypair that I used and the security group.
I also specify <emphasis>--no-public-ip</emphasis>, which disables the IP address
allocation and association. This is specific to the setup of <emphasis>exoscale</emphasis>,
which automatically uses a public IP address for the instances.
+        </para>
+        <note>
+            <para>
+                The latest version of knife-cloudstack allows you to manage keypairs and
security groups. For instance, listing, creation and deletion of keypairs are possible, as well
as listing of security groups:
+                <programlisting>
+$ knife cs securitygroup list
+Name     Description             Account         
+default  Default Security Group
+www      apache server 
+$ knife cs keypair list
+Name      Fingerprint                                    
+exoscale  xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx
+                </programlisting>
+            </para>
+        </note>
+        <para>
+            When using a &PRODUCT; based cloud in an Advanced zone setting, <emphasis>knife</emphasis>
can automatically allocate and associate an IP address. To illustrate this slightly different
case I use <ulink url="">iKoula</ulink>, a French cloud
provider that uses &PRODUCT;. I edit my <emphasis>knife.rb</emphasis> file
to set up a different endpoint and the different API and secret keys. I remove the keypair,
security group and public IP options, and I do not specify an identity file as I will retrieve
the ssh password with the <emphasis>--cloudstack-password</emphasis> option. The
example is as follows:
+        </para>
+        <para>
+            <programlisting>
+$ knife cs server create --service m1.small --template "CentOS 6.4 - Minimal - 64bits" --ssh-user
root --cloudstack-password --run-list "recipe[apache2]" foobar
+Waiting for Server to be created........
 Allocate ip address, create forwarding rules
 params: {"command"=>"associateIpAddress", "zoneId"=>"a41b82a0-78d8-4a8f-bb79-303a791bb8a7",
-Allocated IP Address:
-Name:       foobar4       
-Public IP:
+Allocated IP Address:
+Name:       foobar       
+Password:   $%@#$%#$%#$     
+Public IP:  178.xx.yy.zz
-Waiting for sshd.....
+Waiting for sshd......
-Name:         foobar4       
-Public IP:
-Environment:  _default      
-Run List:                   
+Name:         foobar     
+Public IP:    178.xx.yy.zz 
+Environment:  _default       
+Run List:     recipe[apache2]
-Bootstrapping Chef on
-ERROR: Errno::ENOENT: No such file or directory - /etc/chef/validation.pem
+Bootstrapping Chef on 178.xx.yy.zz
+178.xx.yy.zz --2013-06-10 13:24:29--
+178.xx.yy.zz Resolving
+        <warning>
+            <para>
+                You will want to review the security implications of doing the bootstrap as
root and using the default password to do so.
+            </para>
+        </warning>
+        <para>
+            With the basics of Chef configuration and usage covered, as well as the basic
examples of using <emphasis>knife cloudstack</emphasis> to provision and configure
instances, we can now move on to the interesting case of configuring a group of instances and
handling basic dependencies between those machines. The typical use case is when machines
need to be provisioned according to a schedule so that information about one instance can
be passed to another one. With <emphasis>knife cloudstack</emphasis> this is possible
by using <emphasis>stacks</emphasis>.
+        </para>
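
The two knife invocations in this patch (Basic zone with keypair and security group, Advanced zone with --cloudstack-password) differ only in a handful of flags. A minimal sketch of assembling either command from a script is given here, assuming Python is used as the wrapper. The function name and its parameters are hypothetical; only the flags that appear in the listings above are used, and nothing is executed.

```python
import os


def knife_server_create(name, service, template, run_list, ssh_user="root",
                        identity=None, keypair=None, security_group=None,
                        public_ip=True, cloudstack_password=False):
    """Build the argument list for a `knife cs server create` call.

    Hypothetical helper: it only assembles flags shown in the listings
    above and does not run knife itself.
    """
    args = ["knife", "cs", "server", "create",
            "--service", service,
            "--template", template,
            "--ssh-user", ssh_user]
    if identity:
        # Expand "~" here: an argument list passed to subprocess without a
        # shell does not get tilde expansion.
        args += ["--identity", os.path.expanduser(identity)]
    if cloudstack_password:
        args.append("--cloudstack-password")
    args += ["--run-list", run_list]
    if keypair:
        args += ["--ssh-keypair", keypair]
    if security_group:
        args += ["--security-group", security_group]
    if not public_ip:
        args.append("--no-public-ip")
    args.append(name)
    return args


# Basic zone example (keypair, security group, no IP allocation).
basic = knife_server_create(
    "foobar", "Tiny", "Linux CentOS 6.4 64-bit", "recipe[apache2]",
    identity="~/.ssh/id_rsa", keypair="foobar", security_group="www",
    public_ip=False)

# Advanced zone example (password retrieved from CloudStack).
advanced = knife_server_create(
    "foobar", "m1.small", "CentOS 6.4 - Minimal - 64bits", "recipe[apache2]",
    cloudstack_password=True)

print(basic)
print(advanced)
```

Either list could then be handed to subprocess.run(args, check=True) on a machine where knife and the knife-cloudstack plugin are installed and knife.rb is configured.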
diff --git a/docs/acs101/en-US/whirr.xml b/docs/acs101/en-US/whirr.xml
index c3f3123..4b14a1a 100644
--- a/docs/acs101/en-US/whirr.xml
+++ b/docs/acs101/en-US/whirr.xml
@@ -24,12 +24,172 @@
 <section id="whirr">
     <title>Apache Whirr</title>
-    <para>Whirr is ta stream processing analysis framework provides a <emphasis>Cloud
in a box</emphasis>.</para>
-    <note>
-        <para>DevCloud is provided as a convenience by community members. It is not
an official &PRODUCT; release artifact.</para>
-        <para>The &PRODUCT; source code however, contains tools to build your own
-    </note>
-    <warning>
-        <para>Storm is </para>
-    </warning>
+    <para><ulink url="">Apache Whirr</ulink> is
a set of libraries to run cloud services. Internally it uses <ulink url="">jclouds</ulink>,
which we introduced earlier via the jclouds-cli interface to &PRODUCT;. It is Java based
and of interest for provisioning clusters of virtual machines on cloud providers. Historically
it started as a set of scripts to deploy <ulink url="">Hadoop</ulink>
clusters on Amazon EC2. We introduce Whirr as a potential &PRODUCT; tool to provision
Hadoop clusters on &PRODUCT; based clouds.</para>
+    <section id="whirr-install">
+    <title>Installing Apache Whirr</title>
+        <para>
+            To install Whirr you can follow the <ulink url="">Quick
Start Guide</ulink>, download a tarball or clone the git repository. In the spirit of
this document we clone the repo:
+        </para>
+        <programlisting>
+git clone git://
+        </programlisting>
+        <para>
+            And build the source with Maven, which we now know and love:
+        </para>
+        <programlisting>
+mvn install        
+        </programlisting>
+        <para>
+            The whirr binary will be available in the <emphasis>bin</emphasis>
directory, which we can add to our path:
+        </para>	
+        <programlisting>
+export PATH=$PATH:/Users/sebgoa/Documents/whirr/bin
+        </programlisting>
+        <para>
+            If all went well you should now be able to get the usage of <emphasis>whirr</emphasis>:
+        </para>
+        <programlisting>
+$ whirr --help
+Unrecognized command '--help'
+Usage: whirr COMMAND [ARGS]
+where COMMAND may be one of:
+  launch-cluster  Launch a new cluster running a service.
+  start-services  Start the cluster services.
+   stop-services  Stop the cluster services.
+restart-services  Restart the cluster services.
+ destroy-cluster  Terminate and cleanup resources for a running cluster.
+destroy-instance  Terminate and cleanup resources for a single instance.
+    list-cluster  List the nodes in a cluster.
+  list-providers  Show a list of the supported providers
+      run-script  Run a script on a specific instance or a group of instances matching a
role name
+         version  Print the version number and exit.
+            help  Show help about an action
+Available roles for instances:
+  cassandra
+  elasticsearch
+  ganglia-metad
+  ganglia-monitor
+  hadoop-datanode
+  hadoop-jobtracker
+  hadoop-namenode
+  hadoop-tasktracker
+  hama-groomserver
+  hama-master
+  hbase-avroserver
+  hbase-master
+  hbase-regionserver
+  hbase-restserver
+  hbase-thriftserver
+  kerberosclient
+  kerberosserver
+  mahout-client
+  mapreduce-historyserver
+  noop
+  pig-client
+  puppet-install
+  solr
+  yarn-nodemanager
+  yarn-resourcemanager
+  zookeeper
+        </programlisting>
+        <para>
+            From the look of the usage you clearly see that <emphasis>whirr</emphasis>
is about more than just <emphasis>hadoop</emphasis> and that it can be used to
configure <emphasis>elasticsearch</emphasis> clusters, <emphasis>cassandra</emphasis>
databases as well as the entire <emphasis>hadoop</emphasis> ecosystem with <emphasis>mahout</emphasis>,
<emphasis>pig</emphasis>, <emphasis>hbase</emphasis>, <emphasis>hama</emphasis>,
<emphasis>mapreduce</emphasis> and <emphasis>yarn</emphasis>.
+        </para>
+    </section>
+    <section id="whirr-use">
+    <title>Using Apache Whirr</title>
+        <para>
+            To get started with Whirr you need to set up the credentials and endpoint of the
&PRODUCT; based cloud that you will be using. Edit the <emphasis>~/.whirr/credentials</emphasis>
file to include a PROVIDER, IDENTITY, CREDENTIAL and ENDPOINT. The PROVIDER needs to be set
to <emphasis>cloudstack</emphasis>, the IDENTITY is your API key, the CREDENTIAL
is your secret key and the ENDPOINT is the endpoint URL. For instance:
+        </para>
+        <para>
+        <programlisting>
+        </programlisting>
+        </para>
+        <para>
+            With the credentials and endpoint defined you can create a <emphasis>properties</emphasis>
file that describes the cluster you want to launch on your cloud. The file contains information
such as the cluster name, the number of instances and their type, the distribution of hadoop
you want to use, the service offering id and the template id of the instances. It also defines
the ssh keys to be used for accessing the virtual machines. In the case of a cloud that uses
security groups, you may also need to specify it. A tricky point is the handling of DNS name
resolution. You might have to use the <emphasis></emphasis>
key to bypass any DNS issues. For a full description of the whirr property keys, see the <ulink
+        </para>
+        <para>
+        <programlisting>
+$ more 
+# Setup an Apache Hadoop Cluster
+# Change the cluster name here
+# Change the name of cluster admin user
+# Change the number of machines in the cluster here
+whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,3 hadoop-datanode+hadoop-tasktracker
+# Uncomment out the following two lines to run CDH
+        </programlisting>
+        </para>
+        <para>
+            You are now ready to launch a Hadoop cluster:
+        </para>
+        <para>
+        <programlisting>
+$ whirr launch-cluster --config 
+Running on provider cloudstack using identity mnH5EbKcKeJd456456345634563456345654634563456345
+Bootstrapping cluster
+Configuring template for bootstrap-hadoop-datanode_hadoop-tasktracker
+Configuring template for bootstrap-hadoop-namenode_hadoop-jobtracker
+Starting 3 node(s) with roles [hadoop-datanode, hadoop-tasktracker]
+Starting 1 node(s) with roles [hadoop-namenode, hadoop-jobtracker]
+>> running InitScript{INSTANCE_NAME=bootstrap-hadoop-datanode_hadoop-tasktracker} on
+>> running InitScript{INSTANCE_NAME=bootstrap-hadoop-datanode_hadoop-tasktracker} on
+>> running InitScript{INSTANCE_NAME=bootstrap-hadoop-datanode_hadoop-tasktracker} on
+>> running InitScript{INSTANCE_NAME=bootstrap-hadoop-namenode_hadoop-jobtracker} on
+<< success executing InitScript{INSTANCE_NAME=bootstrap-hadoop-datanode_hadoop-tasktracker}
on node(b9457a87-5890-4b6f-9cf3-1ebd1581f725): {output=This function does nothing. It just
needs to exist so"retry_helpers") doesn't call something which doesn't exist
+Get:1 precise-security Release.gpg [198 B]
+Get:2 precise-security Release [49.6 kB]
+Hit precise Release.gpg
+Get:3 precise-updates Release.gpg [198 B]
+Get:4 precise-backports Release.gpg [198 B]
+Hit precise Release
+You can log into instances using the following ssh commands:
+[hadoop-datanode+hadoop-tasktracker]: ssh -i /Users/sebastiengoasguen/.ssh/id_rsa -o "UserKnownHostsFile
/dev/null" -o StrictHostKeyChecking=no sebastiengoasguen@185.xx.yy.zz
+[hadoop-datanode+hadoop-tasktracker]: ssh -i /Users/sebastiengoasguen/.ssh/id_rsa -o "UserKnownHostsFile
/dev/null" -o StrictHostKeyChecking=no sebastiengoasguen@185.zz.zz.rr
+[hadoop-datanode+hadoop-tasktracker]: ssh -i /Users/sebastiengoasguen/.ssh/id_rsa -o "UserKnownHostsFile
/dev/null" -o StrictHostKeyChecking=no
+[hadoop-namenode+hadoop-jobtracker]: ssh -i /Users/sebastiengoasguen/.ssh/id_rsa -o "UserKnownHostsFile
/dev/null" -o StrictHostKeyChecking=no sebastiengoasguen@185.ii.oo.pp
+To destroy cluster, run 'whirr destroy-cluster' with the same options used to launch it.
+        </programlisting>
+        </para>
+        <para>
+            After the bootstrapping process finishes you should be able to log in to your instances
and use <emphasis>hadoop</emphasis>, or if you are running a proxy on your machine,
you will be able to access your hadoop cluster locally. Testing of Whirr for &PRODUCT;
is still under <ulink url="">investigation</ulink>
and the subject of a Google Summer of Code 2013 project. More information will be added as
we learn more.
+        </para>
+    </section>
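
The Whirr section above describes two inputs: a credentials file with PROVIDER, IDENTITY, CREDENTIAL and ENDPOINT, and a properties file whose whirr.instance-templates key lists node counts and roles. A minimal sketch of sanity-checking both before a launch follows. It assumes both files use plain KEY=VALUE lines with '#' comments; the function names, the placeholder keys and the example endpoint URL are hypothetical, not values from the patch.

```python
REQUIRED_CREDENTIAL_KEYS = ("PROVIDER", "IDENTITY", "CREDENTIAL", "ENDPOINT")


def parse_key_values(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that carry no '='
            result[key.strip()] = value.strip()
    return result


def missing_credentials(credentials):
    """Return the required credential keys that are absent or empty."""
    return [k for k in REQUIRED_CREDENTIAL_KEYS if not credentials.get(k)]


def parse_instance_templates(value):
    """Turn '1 a+b,3 c+d' into [(1, ['a', 'b']), (3, ['c', 'd'])]."""
    groups = []
    for group in value.split(","):
        count, roles = group.strip().split(None, 1)
        groups.append((int(count), roles.split("+")))
    return groups


# Placeholder credentials: the key values and the URL are made up.
credentials_text = """\
PROVIDER=cloudstack
IDENTITY=my-api-key
CREDENTIAL=my-secret-key
ENDPOINT=https://cloud.example.com/client/api
"""

# Only the instance-templates line is taken from the patch.
properties_text = """\
# Change the number of machines in the cluster here
whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,3 hadoop-datanode+hadoop-tasktracker
"""

credentials = parse_key_values(credentials_text)
properties = parse_key_values(properties_text)
templates = parse_instance_templates(properties["whirr.instance-templates"])

print("missing credential keys:", missing_credentials(credentials))
for count, roles in templates:
    print(count, "node(s) with roles", roles)
```

The parsed groups correspond to the "Starting 3 node(s) with roles [...]" and "Starting 1 node(s) with roles [...]" lines in the launch log above, which makes this a cheap way to confirm the cluster layout before calling whirr launch-cluster.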
