tajo-commits mailing list archives

From hyun...@apache.org
Subject svn commit: r1462715 - /incubator/tajo/site/getting_started.html
Date Sat, 30 Mar 2013 08:09:17 GMT
Author: hyunsik
Date: Sat Mar 30 08:09:16 2013
New Revision: 1462715

URL: http://svn.apache.org/r1462715
Log:
Update 'Getting Started' page by TAJO-17

Modified:
    incubator/tajo/site/getting_started.html

Modified: incubator/tajo/site/getting_started.html
URL: http://svn.apache.org/viewvc/incubator/tajo/site/getting_started.html?rev=1462715&r1=1462714&r2=1462715&view=diff
==============================================================================
--- incubator/tajo/site/getting_started.html (original)
+++ incubator/tajo/site/getting_started.html Sat Mar 30 08:09:16 2013
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at Mar 21, 2013
+ | Generated by Apache Maven Doxia at Mar 30, 2013
  | Rendered using Apache Maven Fluido Skin 1.3.0
 -->
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20130321" />
+    <meta name="Date-Revision-yyyymmdd" content="20130330" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Getting Started</title>
     <link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
@@ -146,7 +146,7 @@
         
                 
                     
-                  <li id="publishDate" class="pull-right">Last Published: 2013-03-21</li>
<li class="divider pull-right">|</li>
+                  <li id="publishDate" class="pull-right">Last Published: 2013-03-30</li>
<li class="divider pull-right">|</li>
               <li id="projectVersion" class="pull-right">Version: 0.2.0-SNAPSHOT</li>
             
                             </ul>
@@ -256,15 +256,20 @@
                 
         <div id="bodyColumn"  class="span9" >
                                   
-            <!-- Licensed to the Apache Software Foundation (ASF) under one or more --><!--
contributor license agreements.  See the NOTICE file distributed with --><!-- this work
for additional information regarding copyright ownership. --><!-- The ASF licenses this
file to You under the Apache License, Version 2.0 --><!-- (the "License"); you may not
use this file except in compliance with --><!-- the License.  You may obtain a copy
of the License at --><!--  --><!-- http://www.apache.org/licenses/LICENSE-2.0
--><!--  --><!-- Unless required by applicable law or agreed to in writing, software
--><!-- distributed under the License is distributed on an "AS IS" BASIS, --><!--
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. --><!-- See
the License for the specific language governing permissions and --><!-- limitations
under the License. --><!--  --><div class="section"><h2>Prerequisites<a
name="Prerequisites"></a></h2><ul><li>Hadoop 2.0.3-alpha</li><li
 >Java 1.6</li></ul></div><div class="section"><h2>Build
Tajo from Source Code<a name="Build_Tajo_from_Source_Code"></a></h2><p>Download
the source code and build Tajo as follows:</p><div><pre>$ git clone http://git-wip-us.apache.org/repos/asf/incubator-tajo.git
+            <!-- Licensed to the Apache Software Foundation (ASF) under one --><!--
or more contributor license agreements.  See the NOTICE file --><!-- distributed with
this work for additional information --><!-- regarding copyright ownership.  The ASF
licenses this file --><!-- to you under the Apache License, Version 2.0 (the --><!--
"License"); you may not use this file except in compliance --><!-- with the License.
 You may obtain a copy of the License at --><!--  --><!-- http://www.apache.org/licenses/LICENSE-2.0
--><!--  --><!-- Unless required by applicable law or agreed to in writing, software
--><!-- distributed under the License is distributed on an "AS IS" BASIS, --><!--
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. --><!-- See
the License for the specific language governing permissions and --><!-- limitations
under the License. --><div class="section"><h2>Prerequisites<a name="Prerequisites"></a></h2><ul><li>Hadoop
2.0.3-alpha</li><li
 >Java 1.6</li></ul></div><div class="section"><h2>Build
Tajo from Source Code<a name="Build_Tajo_from_Source_Code"></a></h2><p>Download
the source code and build Tajo as follows:</p><div><pre>$ git clone http://git-wip-us.apache.org/repos/asf/incubator-tajo.git
tajo
 $ cd tajo
-$ mvn package -DskipTests -Ddisk -Ptar
-$ ls tajo-dist/target/tajo-x.y.z.tar.gz</pre></div><p>If you want to know
the build instruction in more detail, please refer to <a href="./build.html">Build Instruction</a>.</p></div><div
class="section"><h2>Unpack tarball<a name="Unpack_tarball"></a></h2><p>You
should unpack the tarball (refer to build instruction).</p><div><pre>$ tar
xzvf tajo-0.2.0-SNAPSHOT.tar.gz</pre></div><p>This will result in the creation
of subdirectory named tajo-x.y.z-SNAPSHOT. You MUST copy this directory into the same directory
on all yarn cluster nodes.</p></div><div class="section"><h2>Configuration<a
name="Configuration"></a></h2><p>First of all, you need to set the environment
variables for your Hadoop cluster and Tajo.</p><div><pre>export JAVA_HOME=/usr/lib/jvm/openjdk-1.7.x
+$ mvn package -DskipTests -Pdist -Dtar
+$ ls tajo-dist/target/tajo-x.y.z.tar.gz</pre></div><p>If you meet some
errors or you want to know the build instruction in more detail, please read <a href="./build.html">Build
Instruction</a>.</p></div><div class="section"><h2>Unpack tarball<a
name="Unpack_tarball"></a></h2><p>You should unpack the tarball (refer
to build instruction).</p><div><pre>$ tar xzvf tajo-0.2.0-SNAPSHOT.tar.gz</pre></div><p>This
will result in the creation of subdirectory named tajo-x.y.z-SNAPSHOT. You MUST copy this
directory into the same directory on all yarn cluster nodes.</p></div><div
class="section"><h2>Configuration<a name="Configuration"></a></h2><p>First
of all, you need to set the environment variables for your Hadoop cluster and Tajo.</p><div><pre>export
JAVA_HOME=/usr/lib/jvm/openjdk-1.6.x
 export HADOOP_HOME=/usr/local/hadoop-2.0.x
 export YARN_HOME=/usr/local/hadoop-2.0.x
-export TAJO_HOME=&lt;tajo-install-dir&gt;</pre></div><p>Tajo requires
an auxiliary service called PullServer for data repartitioning. So, we must configure for
PullServer in $<a name="HADOOP_HOME">HADOOP_HOME</a>/etc/hadoop/yarn-site.xml.</p><div><pre>&lt;property&gt;
+export TAJO_HOME=&lt;tajo-install-dir&gt;</pre></div><p>Tajo requires
an auxiliary service called PullServer for data repartitioning. For this, you must add or
modify the following configuration parameters in $<a name="HADOOP_HOME">HADOOP_HOME</a>/etc/hadoop/yarn-site.xml.</p><div><pre>&lt;property&gt;
   &lt;name&gt;yarn.nodemanager.aux-services&lt;/name&gt;
-  &lt;value&gt;tajo.pullserver&lt;/value&gt;
+  &lt;value&gt;mapreduce.shuffle,tajo.pullserver&lt;/value&gt;
+&lt;/property&gt;
+
+&lt;property&gt;
+  &lt;name&gt;yarn.nodemanager.aux-services.mapreduce.shuffle.class&lt;/name&gt;
+  &lt;value&gt;org.apache.hadoop.mapred.ShuffleHandler&lt;/value&gt;
 &lt;/property&gt;
 
 &lt;property&gt;
@@ -275,10 +280,10 @@ export TAJO_HOME=&lt;tajo-install-dir&gt
 &lt;property&gt;
   &lt;name&gt;tajo.task.localdir&lt;/name&gt;
   &lt;value&gt;/tmp/tajo-localdir&lt;/value&gt;
-&lt;/property&gt;</pre></div><p>Likewise, you should copy some
jar files to the hadoop library dir.</p><div><pre>$ cp $TAJO_HOME/tajo-common-x.y.z.jar
$HADOOP_HOME/share/yarn/lib
+&lt;/property&gt;</pre></div><p>For the auxiliary, you should copy
some jar files to the Hadoop Yarn library dir.</p><div><pre>$ cp $TAJO_HOME/tajo-common-x.y.z.jar
$HADOOP_HOME/share/yarn/lib
 $ cp $TAJO_HOME/tajo-catalog-common-x.y.z.jar $HADOOP_HOME/share/yarn/lib
 $ cp $TAJO_HOME/tajo-core-pullserver-x.y.z.jar $HADOOP_HOME/share/yarn/lib
-$ cp $TAJO_HOME/tajo-core-storage-x.y.z.jar $HADOOP_HOME/share/yarn/lib</pre></div><p>Copy
$<a name="TAJO_HOME">TAJO_HOME</a>/conf/tajo-site.xml.templete to tajo-site.xml.
Then, add the following configs to your tajo-site.xml. Change <i>hostname</i>
and <i>port</i> to your namenode address.</p><div><pre>  &lt;property&gt;
+$ cp $TAJO_HOME/tajo-core-storage-x.y.z.jar $HADOOP_HOME/share/yarn/lib</pre></div><p>Please
copy $<a name="TAJO_HOME">TAJO_HOME</a>/conf/tajo-site.xml.template to tajo-site.xml.
You must add the following configs to your tajo-site.xml and then change <i>hostname</i>
and <i>port</i> to your namenode address.</p><div><pre>  &lt;property&gt;
     &lt;name&gt;tajo.rootdir&lt;/name&gt;
     &lt;value&gt;hdfs://hostname:port/tajo&lt;/value&gt;
   &lt;/property&gt;
@@ -286,16 +291,16 @@ $ cp $TAJO_HOME/tajo-core-storage-x.y.z.
   &lt;property&gt;
     &lt;name&gt;tajo.cluster.distributed&lt;/name&gt;
     &lt;value&gt;true&lt;/value&gt;
-  &lt;/property&gt;</pre></div><p>If you want know configuration
in more detail, refer to <a href="./configuration.html">Configuration Guide</a>.</p></div><div
class="section"><h2>Running Tajo<a name="Running_Tajo"></a></h2><p>Before
launching the tajo, you should create the tajo root dir and set the permission as follows:</p><div><pre>$
$HADOOP_HOME/bin/hadoop fs -mkdir       /tajo
+  &lt;/property&gt;</pre></div><p>If you want know configuration
in more detail, read <a href="./configuration.html">Configuration Guide</a>.</p></div><div
class="section"><h2>Running Tajo<a name="Running_Tajo"></a></h2><p>Before
launching the tajo, you should create the tajo root dir and set the permission as follows:</p><div><pre>$
$HADOOP_HOME/bin/hadoop fs -mkdir       /tajo
 $ $HADOOP_HOME/bin/hadoop fs -chmod g+w   /tajo</pre></div><p>To launch
the tajo master, execute start-tajo.sh.</p><div><pre>$ $TAJO_HOME/bin/start-tajo.sh</pre></div><p>After
then, you can use tajo-cli to access the command line interface of Tajo.</p><div><pre>$
$TAJO_HOME/bin/tajo cli</pre></div></div><div class="section"><h2>Query
Execution<a name="Query_Execution"></a></h2><p>First of all, we need
to prepare some data for query execution.</p><div><pre>$ mkdir /home/x/table1
 $ cd /home/x/table1
-$ cat &gt;&gt; table1
+$ cat &gt; table1
 1|abc|1.1|a
 2|def|2.3|b
 3|ghi|3.4|c
 4|jkl|4.5|d
 5|mno|5.6|e
-&lt;EOF&gt;</pre></div><p>This schema of this table is (int, string,
float, string).</p><div><pre>$ $TAJO_HOME/bin/tajo cli
+&lt;CTRL + D&gt;</pre></div><p>This schema of this table is (int,
string, float, string).</p><div><pre>$ $TAJO_HOME/bin/tajo cli
 
 tajo&gt; create external table table1 (id int, name string, score float, type string)
using csv with ('csvfile.delimiter'='|') location 'file:/home/x/table1'</pre></div><p>In
order to load an external table, we need to use 'create external table' statement. In the
location clause, you should use the absolute path with an appropriate scheme. If the table
resides in HDFS, we should use 'hdfs' instead of 'file'.</p><p>If you want to
know DDL statements in more detail, please see <a href="./query_language.html">Query
Language</a>.</p><div><pre>tajo&gt; /t
 table1</pre></div><p>'/t' command shows the list of tables.</p><div><pre>tajo&gt;
/d table1


