cloudstack-commits mailing list archives

From seb...@apache.org
Subject [1/2] git commit: updated refs/heads/ACS101 to 7ddf787
Date Sun, 23 Jun 2013 15:09:53 GMT
Updated Branches:
  refs/heads/ACS101 52787777e -> 7ddf787f6


finished whirr and started saltcloud


Project: http://git-wip-us.apache.org/repos/asf/cloudstack/repo
Commit: http://git-wip-us.apache.org/repos/asf/cloudstack/commit/c30152c6
Tree: http://git-wip-us.apache.org/repos/asf/cloudstack/tree/c30152c6
Diff: http://git-wip-us.apache.org/repos/asf/cloudstack/diff/c30152c6

Branch: refs/heads/ACS101
Commit: c30152c63761fea194238e42e34d57444c2ad7fc
Parents: 5278777
Author: Sebastien Goasguen <runseb@gmail.com>
Authored: Sat Jun 22 10:37:45 2013 -0400
Committer: Sebastien Goasguen <runseb@gmail.com>
Committed: Sat Jun 22 10:37:45 2013 -0400

----------------------------------------------------------------------
 docs/acs101/en-US/Wrappers.xml  |   2 +-
 docs/acs101/en-US/saltcloud.xml | 106 +++++++++++++++++++++++++++++----
 docs/acs101/en-US/whirr.xml     | 111 ++++++++++++++++++++++++++++++++++-
 3 files changed, 207 insertions(+), 12 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/cloudstack/blob/c30152c6/docs/acs101/en-US/Wrappers.xml
----------------------------------------------------------------------
diff --git a/docs/acs101/en-US/Wrappers.xml b/docs/acs101/en-US/Wrappers.xml
index 70f2977..b963e94 100644
--- a/docs/acs101/en-US/Wrappers.xml
+++ b/docs/acs101/en-US/Wrappers.xml
@@ -26,7 +26,7 @@
 <chapter id="Wrappers">
   <title>Wrappers</title>
   <para>
-    This is a test paragraph
+    In this paragraph we introduce several &PRODUCT; <emphasis>wrappers</emphasis>.
These tools use the client libraries presented in the previous chapter and add
functionality that involves some higher-level orchestration. For instance, <emphasis>knife-cloudstack</emphasis>
uses the power of <ulink url="http://opscode.com">Chef</ulink>, the configuration
management system, to seamlessly bootstrap instances running in a &PRODUCT; cloud. Apache
<ulink url="http://whirr.apache.org">Whirr</ulink> uses <ulink url="http://jclouds.incubator.apache.org">jclouds</ulink>
to bootstrap <ulink url="http://hadoop.apache.org">Hadoop</ulink> clusters in the
cloud, and Pallet does the same using the Clojure language.
   </para>
 
   <xi:include href="knife-cloudstack.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />

http://git-wip-us.apache.org/repos/asf/cloudstack/blob/c30152c6/docs/acs101/en-US/saltcloud.xml
----------------------------------------------------------------------
diff --git a/docs/acs101/en-US/saltcloud.xml b/docs/acs101/en-US/saltcloud.xml
index fe6f306..9f531b9 100644
--- a/docs/acs101/en-US/saltcloud.xml
+++ b/docs/acs101/en-US/saltcloud.xml
@@ -22,14 +22,100 @@
  under the License.
 -->
 
-<section id="saltcloud">
-    <title>Saltcloud</title>
-    <para>Salt is an alternative to Chef and Puppet  ovides a <emphasis>Cloud
in a box</emphasis>.</para>
-    <note>
-        <para>DevCloud is provided as a convenience by community members. It is not
an official &PRODUCT; release artifact.</para>
-        <para>The &PRODUCT; source code however, contains tools to build your own
DevCloud.</para>
-    </note>
-    <warning>
-        <para>Storm is </para>
-    </warning>
+<section id="salt">
+    <title>Salt</title>
+    <para><ulink url="http://saltstack.com">Salt</ulink> is a configuration
management system written in Python. It can be seen as an alternative to Chef and Puppet.
Its concept is similar, with a master node holding states called <emphasis>salt states
(SLS)</emphasis> and minions that get their configuration from the master. A nice difference
from Chef and Puppet is that Salt is also a remote execution engine and can be used to execute
commands on the minions by specifying a set of targets. In this chapter we introduce Salt
and dive into <ulink url="http://saltcloud.org">SaltCloud</ulink>, an open source
tool to provision <emphasis>Salt</emphasis> masters and minions in the cloud.
<emphasis>SaltCloud</emphasis> can be seen as an alternative to <emphasis>knife-cs</emphasis>,
though certainly with less functionality.
+	</para>
+
+    <section id="intro-to-salt">
+    <title>Quick Introduction to Salt</title>
+    <para>
+    </para>
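+    <para>
+        Purely as an illustrative sketch (the file name, package and targets below are examples,
not taken from an official tutorial), a salt state is a YAML file stored on the master, for instance
under <emphasis>/srv/salt</emphasis>, and remote execution commands are run from the master
against a set of targets:
+    </para>
+    <programlisting>
+    # /srv/salt/webserver.sls -- example state that installs the apache package
+    apache:
+      pkg:
+        - installed
+    </programlisting>
+    <programlisting>
+    # ping all minions, then run a command on the minions whose id starts with 'web'
+    $ salt '*' test.ping
+    $ salt 'web*' cmd.run 'uptime'
+    </programlisting>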
+
+    </section>
+
+    <section id="salt-cloud">
+    <title>SaltCloud Installation and Usage</title>
+	    <para>
+	        To install SaltCloud one simply clones the git repository. To develop SaltCloud,
fork it on GitHub and clone your fork, then commit patches and submit a pull request. SaltCloud
depends on libcloud, therefore you will need libcloud installed as well; see the previous
chapter to set up libcloud. With SaltCloud installed and in your path, you need to define a
cloud provider in <emphasis>~/.saltcloud/cloud</emphasis>. For example:
+	    </para>
+	    <programlisting>
+	<![CDATA[
+	providers:
+	  exoscale:
+	    apikey: <your api key> 
+	    secretkey: <your secret key>
+	    host: api.exoscale.ch
+	    path: /compute
+	    securitygroup: default
+	    user: root
+	    private_key: ~/.ssh/id_rsa
+	    provider: cloudstack
+	]]>
+	    </programlisting>
+	    <para>
+	        The apikey, secretkey, host, path and provider keys are mandatory. The securitygroup
key specifies which security group to use when starting the instances in that cloud. The
user is the username used to connect to the instances via ssh, and the private_key is
the ssh key to use. Note that the optional parameters are specific to the cloud this was
tested on; clouds set up as advanced zones in particular will need a different configuration.
+	    </para>
+	    <warning><para>
+	        SaltCloud uses libcloud. Support for advanced zones in libcloud is still experimental;
therefore, using SaltCloud in an advanced zone will likely require some development of libcloud.</para>
+	    </warning>
+		<para>
+	        Once a provider is defined, we can start using salt-cloud to list the zones, the
service offerings and the templates available on that cloud provider. So far this is nothing more
than what libcloud provides. For example:
+	    </para>
+	    <programlisting>
+	$ salt-cloud --list-locations exoscale
+	$ salt-cloud --list-images exoscale
+	$ salt-cloud --list-sizes exoscale
+	    </programlisting>
+	    <para>
+	        To start creating instances and configuring them with Salt, we need to define node
profiles in <emphasis>~/.saltcloud/config</emphasis>. To illustrate two different
profiles we show a Salt master and a minion. The master needs a specific template (image:uuid)
and a service offering or instance type (size:uuid). In a basic zone with keypair access and security
groups, one also needs to specify which keypair to use and where to listen for ssh connections,
and of course the provider has to be defined (e.g. exoscale in our case, defined above).
Below is the node profile for a Salt master deployed in the cloud:
+	    </para>
+	    <programlisting>
+	<![CDATA[
+	ubuntu-exoscale-master:
+	    provider: exoscale
+	    image: 1d16c78d-268f-47d0-be0c-b80d31e765d2 
+	    size: b6cd1ff5-3a2f-4e9d-a4d1-8988c1191fe8 
+	    ssh_interface: public
+	    ssh_username: root
+	    keypair: exoscale
+	    make_master: True
+	    master:
+	       user: root
+	       interface: 0.0.0.0
+	]]>
+	    </programlisting>
+	    <para>
+	        The master key defines which user to use and which interface to listen on; the make_master key, if
set to true, will bootstrap this node as a Salt master. To create it on our cloud provider simply
enter:
+	    </para>
+	    <programlisting>
+	$ salt-cloud -p ubuntu-exoscale-master mymaster
+	    </programlisting>
+	    <para>
+	        Here <emphasis>mymaster</emphasis> will be the instance name. To
create a minion, add a minion node profile in the config file:
+	    </para>
+	    <programlisting>
+	<![CDATA[
+	ubuntu-exoscale-minion:
+	    provider: exoscale
+	    image: 1d16c78d-268f-47d0-be0c-b80d31e765d2
+	    size: b6cd1ff5-3a2f-4e9d-a4d1-8988c1191fe8
+	    ssh_interface: public
+	    ssh_username: root
+	    keypair: exoscale
+	]]>
+	    </programlisting>
+	    <para>
+	        You would then start it with:
+	    </para>
+	    <programlisting>
+	$ salt-cloud -p ubuntu-exoscale-minion myminion
+	    </programlisting>
+	    <note>
+	        <para>SaltCloud is still in an early phase of development and has little concept
of dependencies between nodes. Therefore, in the example described above, the minion would not
know where the master is; this has to be resolved by hand by passing the IP of the
master in the config profile of the minion, as sketched below. However, this may not be a problem if the master
already exists and is reachable by the instances.
+	        </para>
+	    </note>
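+	    <para>
+	        As an illustrative sketch only (the exact keys may vary with your SaltCloud version), the minion
profile can carry a <emphasis>minion</emphasis> block that points the minion to an existing master, for example:
+	    </para>
+	    <programlisting>
+	<![CDATA[
+	ubuntu-exoscale-minion:
+	    provider: exoscale
+	    image: 1d16c78d-268f-47d0-be0c-b80d31e765d2
+	    size: b6cd1ff5-3a2f-4e9d-a4d1-8988c1191fe8
+	    ssh_interface: public
+	    ssh_username: root
+	    keypair: exoscale
+	    minion:
+	        master: <ip of your master>
+	]]>
+	    </programlisting>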
+
+    </section>
+
 </section>

http://git-wip-us.apache.org/repos/asf/cloudstack/blob/c30152c6/docs/acs101/en-US/whirr.xml
----------------------------------------------------------------------
diff --git a/docs/acs101/en-US/whirr.xml b/docs/acs101/en-US/whirr.xml
index 4b14a1a..903c54c 100644
--- a/docs/acs101/en-US/whirr.xml
+++ b/docs/acs101/en-US/whirr.xml
@@ -153,6 +153,16 @@ whirr.endpoint=https://the/endpoint/url
 whirr.image-id=1d16c78d-268f-47d0-be0c-b80d31e765d2
         </programlisting>
         </para>
+        <warning>
+            <para>
+                The example shown above is specific to a production <ulink url="http://exoscale.ch">Cloud</ulink>
set up as a basic zone. This cloud uses security groups for isolation between instances; the
proper rules had to be set up by hand. Also note the use of <emphasis>whirr.store-cluster-in-etc-hosts</emphasis>,
shown on its own below. If set to true, Whirr will edit the <emphasis>/etc/hosts</emphasis> file of the
nodes and enter the IP addresses. This is handy when DNS resolution is problematic.
+            </para>
+        </warning>
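+        <para>
+            For reference, a minimal sketch of the line in <emphasis>hadoop.properties</emphasis>
that enables this behavior is:
+        </para>
+        <programlisting>
+whirr.store-cluster-in-etc-hosts=true
+        </programlisting>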
+        <note>
+            <para>
+                To use the Cloudera Hadoop distribution (CDH) as in the example above,
you will need to copy the <emphasis>services/cdh/src/main/resources/functions</emphasis>
directory to the root of your Whirr source, as sketched below. In this directory you will find the bash scripts
used to bootstrap the instances. It may be handy to edit those scripts.
+            </para>
+        </note>
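+        <para>
+            As a quick sketch, assuming the Whirr source tree is checked out in a directory
called <emphasis>whirr</emphasis>, the copy described in the note above could look like:
+        </para>
+        <programlisting>
+$ cd whirr
+$ cp -r services/cdh/src/main/resources/functions .
+        </programlisting>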
         <para>
             You are now ready to launch an hadoop cluster:
         </para>
@@ -188,8 +198,107 @@ To destroy cluster, run 'whirr destroy-cluster' with the same options
used to la
         </programlisting>
         </para>
         <para>
-            After the boostrapping process finishes you should be able to login to your instances
and use <emphasis>hadoop</emphasis> or if you are running a proxy on your machine,
you will be able to access your hadoop cluster locally. Testing of Whirr for &PRODUCT;
is still under <ulink url="https://issues.apache.org/jira/browse/WHIRR-725">investigation</ulink>
and the subject of a Google Summer of Code 2013 project. More information will be added as
we learn them.
+            After the bootstrapping process finishes, you should be able to log in to your
instances and use <emphasis>hadoop</emphasis>, or, if you are running a proxy on
your machine (see the sketch below), you will be able to access your hadoop cluster locally. Testing of Whirr for
&PRODUCT; is still under <ulink url="https://issues.apache.org/jira/browse/WHIRR-725">investigation</ulink>
and the subject of a Google Summer of Code 2013 project. We have identified issues with
the use of security groups. Moreover, this was tested on a basic zone; complete testing on
an advanced zone is future work.
         </para>
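+        <para>
+            As a sketch of the proxy setup: Whirr typically writes a proxy script under
<emphasis>~/.whirr/</emphasis> in a directory named after your cluster. Assuming a hypothetical
cluster name of <emphasis>hadoop</emphasis>, you would start the proxy with:
+        </para>
+        <programlisting>
+$ sh ~/.whirr/hadoop/hadoop-proxy.sh
+        </programlisting>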
     </section>
 
+    <section id="using-map-reduce">
+    <title>Running Map-Reduce jobs on Hadoop</title>
+        <para>
+        Whirr gives you the ssh command to connect to the instances of your hadoop cluster.
Log in to the namenode and browse the hadoop file system that was created:
+        </para>
+        <programlisting>
+$ hadoop fs -ls /
+Found 5 items
+drwxrwxrwx   - hdfs supergroup          0 2013-06-21 20:11 /hadoop
+drwxrwxrwx   - hdfs supergroup          0 2013-06-21 20:10 /hbase
+drwxrwxrwx   - hdfs supergroup          0 2013-06-21 20:10 /mnt
+drwxrwxrwx   - hdfs supergroup          0 2013-06-21 20:11 /tmp
+drwxrwxrwx   - hdfs supergroup          0 2013-06-21 20:11 /user
+        </programlisting>
+		<para>Create a directory to put your input data:</para>
+        <programlisting>
+$ hadoop fs -mkdir input
+$ hadoop fs -ls /user/sebastiengoasguen
+Found 1 items
+drwxr-xr-x   - sebastiengoasguen supergroup          0 2013-06-21 20:15 /user/sebastiengoasguen/input
+        </programlisting>
+        <para>Create a test input file and put it in the hadoop file system:</para>
+		<programlisting>
+$ cat foobar 
+this is a test to count the words
+$ hadoop fs -put ./foobar input
+$ hadoop fs -ls /user/sebastiengoasguen/input
+Found 1 items
+-rw-r--r--   3 sebastiengoasguen supergroup         34 2013-06-21 20:17 /user/sebastiengoasguen/input/foobar
+        </programlisting>
+        <para>Define the map-reduce environment. Note that this default Cloudera distribution
installation uses MRv1. To use YARN, one would have to edit the hadoop.properties file.
+        </para>
+        <programlisting>
+$ export HADOOP_MAPRED_HOME=/usr/lib/hadoop-0.20-mapreduce
+        </programlisting>
+        <para>Start the map-reduce job:</para>
+        <programlisting>
+<![CDATA[
+			$ hadoop jar $HADOOP_MAPRED_HOME/hadoop-examples.jar wordcount input output
+			13/06/21 20:19:59 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments.
Applications should implement Tool for the same.
+			13/06/21 20:20:00 INFO input.FileInputFormat: Total input paths to process : 1
+			13/06/21 20:20:00 INFO mapred.JobClient: Running job: job_201306212011_0001
+			13/06/21 20:20:01 INFO mapred.JobClient:  map 0% reduce 0%
+			13/06/21 20:20:11 INFO mapred.JobClient:  map 100% reduce 0%
+			13/06/21 20:20:17 INFO mapred.JobClient:  map 100% reduce 33%
+			13/06/21 20:20:18 INFO mapred.JobClient:  map 100% reduce 100%
+			13/06/21 20:20:21 INFO mapred.JobClient: Job complete: job_201306212011_0001
+			13/06/21 20:20:22 INFO mapred.JobClient: Counters: 32
+			13/06/21 20:20:22 INFO mapred.JobClient:   File System Counters
+			13/06/21 20:20:22 INFO mapred.JobClient:     FILE: Number of bytes read=133
+			13/06/21 20:20:22 INFO mapred.JobClient:     FILE: Number of bytes written=766347
+			13/06/21 20:20:22 INFO mapred.JobClient:     FILE: Number of read operations=0
+			13/06/21 20:20:22 INFO mapred.JobClient:     FILE: Number of large read operations=0
+			13/06/21 20:20:22 INFO mapred.JobClient:     FILE: Number of write operations=0
+			13/06/21 20:20:22 INFO mapred.JobClient:     HDFS: Number of bytes read=157
+			13/06/21 20:20:22 INFO mapred.JobClient:     HDFS: Number of bytes written=50
+			13/06/21 20:20:22 INFO mapred.JobClient:     HDFS: Number of read operations=2
+			13/06/21 20:20:22 INFO mapred.JobClient:     HDFS: Number of large read operations=0
+			13/06/21 20:20:22 INFO mapred.JobClient:     HDFS: Number of write operations=3
+			13/06/21 20:20:22 INFO mapred.JobClient:   Job Counters 
+			13/06/21 20:20:22 INFO mapred.JobClient:     Launched map tasks=1
+			13/06/21 20:20:22 INFO mapred.JobClient:     Launched reduce tasks=3
+			13/06/21 20:20:22 INFO mapred.JobClient:     Data-local map tasks=1
+			13/06/21 20:20:22 INFO mapred.JobClient:     Total time spent by all maps in occupied
slots (ms)=10956
+			13/06/21 20:20:22 INFO mapred.JobClient:     Total time spent by all reduces in occupied
slots (ms)=15446
+			13/06/21 20:20:22 INFO mapred.JobClient:     Total time spent by all maps waiting after
reserving slots (ms)=0
+			13/06/21 20:20:22 INFO mapred.JobClient:     Total time spent by all reduces waiting after
reserving slots (ms)=0
+			13/06/21 20:20:22 INFO mapred.JobClient:   Map-Reduce Framework
+			13/06/21 20:20:22 INFO mapred.JobClient:     Map input records=1
+			13/06/21 20:20:22 INFO mapred.JobClient:     Map output records=8
+			13/06/21 20:20:22 INFO mapred.JobClient:     Map output bytes=66
+			13/06/21 20:20:22 INFO mapred.JobClient:     Input split bytes=123
+			13/06/21 20:20:22 INFO mapred.JobClient:     Combine input records=8
+			13/06/21 20:20:22 INFO mapred.JobClient:     Combine output records=8
+			13/06/21 20:20:22 INFO mapred.JobClient:     Reduce input groups=8
+			13/06/21 20:20:22 INFO mapred.JobClient:     Reduce shuffle bytes=109
+			13/06/21 20:20:22 INFO mapred.JobClient:     Reduce input records=8
+			13/06/21 20:20:22 INFO mapred.JobClient:     Reduce output records=8
+			13/06/21 20:20:22 INFO mapred.JobClient:     Spilled Records=16
+			13/06/21 20:20:22 INFO mapred.JobClient:     CPU time spent (ms)=1880
+			13/06/21 20:20:22 INFO mapred.JobClient:     Physical memory (bytes) snapshot=469413888
+			13/06/21 20:20:22 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=5744541696
+			13/06/21 20:20:22 INFO mapred.JobClient:     Total committed heap usage (bytes)=207687680
+]]>
+        </programlisting>
+        <para>Finally, you can check the output:</para>
+        <programlisting>
+$ hadoop fs -cat output/part-* | head
+this	1
+to		1
+the		1
+a		1
+count	1
+is		1
+test	1
+words	1
+        </programlisting>            
+    </section>
+
 </section>

