gora-commits mailing list archives

From build...@apache.org
Subject svn commit: r923977 - in /websites/staging/gora/trunk/content: ./ current/index.html
Date Mon, 29 Sep 2014 01:07:57 GMT
Author: buildbot
Date: Mon Sep 29 01:07:56 2014
New Revision: 923977

Log:
Staging update by buildbot for gora

Modified:
    websites/staging/gora/trunk/content/   (props changed)
    websites/staging/gora/trunk/content/current/index.html

Propchange: websites/staging/gora/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Mon Sep 29 01:07:56 2014
@@ -1 +1 @@
-1628110
+1628111

Modified: websites/staging/gora/trunk/content/current/index.html
==============================================================================
--- websites/staging/gora/trunk/content/current/index.html (original)
+++ websites/staging/gora/trunk/content/current/index.html Mon Sep 29 01:07:56 2014
@@ -276,25 +276,31 @@ detected.</p>
 <h4 id="building-goraci">Building GoraCI</h4>
 <p>As GoraCI is packaged with the Gora master branch source it is automatically 
 built every time you execute</p>
-<p><code>mvn install</code></p>
+<div class="codehilite"><pre><span class="n">mvn</span> <span class="n">install</span>
+</pre></div>
+
+
 <p>The maven pom file has some profiles that attempt to make it easier to run
 GoraCI against different Gora backends by copying the jars you need into <code>lib</code>.
 Before packaging its important to edit <code>gora.properties</code> and set it correctly
 for your datastore.  To run against Accumulo do the following.</p>
-<p><code>
-  vim src/main/resources/gora.properties //set Accumulo properties</p>
-<p>mvn package -Paccumulo-1.4
-</code></p>
+<div class="codehilite"><pre><span class="n">vim</span> <span class="n">src</span><span class="o">/</span><span class="n">main</span><span class="o">/</span><span class="n">resources</span><span class="o">/</span><span class="n">gora</span><span class="p">.</span><span class="k">properties</span> <span class="o">//</span><span class="n">set</span> <span class="n">Accumulo</span> <span class="k">properties</span>
+<span class="n">mvn</span> <span class="n">package</span> <span class="o">-</span><span class="n">Paccumulo</span><span class="o">-</span>1<span class="p">.</span>4
+</pre></div>
+
+
 <p>To run against HBase, do the following.</p>
-<p><code>
-  vim src/main/resources/gora.properties //set HBase properties</p>
-<p>mvn package -Phbase-0.92
-</code></p>
+<div class="codehilite"><pre><span class="n">vim</span> <span class="n">src</span><span class="o">/</span><span class="n">main</span><span class="o">/</span><span class="n">resources</span><span class="o">/</span><span class="n">gora</span><span class="p">.</span><span class="k">properties</span> <span class="o">//</span><span class="n">set</span> <span class="n">HBase</span> <span class="k">properties</span>
+<span class="n">mvn</span> <span class="n">package</span> <span class="o">-</span><span class="n">Phbase</span><span class="o">-</span>0<span class="p">.</span>92
+</pre></div>
+
+
 <p>To run against Cassandra, do the following.</p>
-<p><code>
-  vim src/main/resources/gora.properties //set Cassandra properties</p>
-<p>mvn package -Pcassandra-1.1.2
-</code></p>
+<div class="codehilite"><pre><span class="n">vim</span> <span class="n">src</span><span class="o">/</span><span class="n">main</span><span class="o">/</span><span class="n">resources</span><span class="o">/</span><span class="n">gora</span><span class="p">.</span><span class="k">properties</span> <span class="o">//</span><span class="n">set</span> <span class="n">Cassandra</span> <span class="k">properties</span>
+<span class="n">mvn</span> <span class="n">package</span> <span class="o">-</span><span class="n">Pcassandra</span><span class="o">-</span>1<span class="p">.</span>1<span class="p">.</span>2
+</pre></div>
+
+
 <p>For other datastores mentioned in <code>gora.properties</code>, you will need to copy the
 appropriate deps into <code>lib</code>.  Feel free to update the pom with other profiles, <a href="https://issues.apache.org/jira/browse/GORA/">open
 a ticket</a> or just <a href="https://github.com/apache/gora/">send us a pull request</a>.</p>
@@ -316,10 +322,11 @@ a ticket</a> or just <a href="https://gi
 <p><a href="https://github.com/apache/gora/blob/master/gora-goraci/goraci.sh">goraci.sh</a> is a helper script that you can use to run the above programs.  It
 assumes all needed jars are in the <code>lib</code> dir.  It does not need the package name.
 You can just run <code>goraci.sh Generator</code>, below is an example.</p>
-<p><code>
-  $ ./goraci.sh Generator</p>
-<p>Usage : Generator <num mappers> <num nodes>
-</code></p>
+<div class="codehilite"><pre>$ <span class="o">./</span><span class="n">goraci</span><span class="p">.</span><span class="n">sh</span> <span class="n">Generator</span>
+<span class="n">Usage</span> <span class="p">:</span> <span class="n">Generator</span> <span class="o">&lt;</span><span class="n">num</span> <span class="n">mappers</span><span class="o">&gt;</span> <span class="o">&lt;</span><span class="n">num</span> <span class="n">nodes</span><span class="o">&gt;</span>
+</pre></div>
+
+
 <p>For Gora to work, it needs a <code>gora.properties</code> file on the classpath and a
 <code>gora-$datastore-mapping.xml</code> mapping file on the classpath, the contents of both are datastore specific,
 more details can be found here [2]. You can edit the ones in src/main/resources
@@ -334,35 +341,38 @@ jackson-core-asl-1.4.2.jar and jackson-m
 <h4 id="goraci-and-hbase">GoraCI and HBase</h4>
 <p>To improve performance running read jobs such as the Verify step, enable
 scanner caching on the command line.  For example:</p>
-<p><code>
-    $ ./gorachi.sh Verify -Dhbase.client.scanner.caching=1000 \
-         -Dmapred.map.tasks.speculative.execution=false verify_dir 1000
-</code></p>
+<div class="codehilite"><pre>$ <span class="o">./</span><span class="n">gorachi</span><span class="p">.</span><span class="n">sh</span> <span class="n">Verify</span> <span class="o">-</span><span class="n">Dhbase</span><span class="p">.</span><span class="n">client</span><span class="p">.</span><span class="n">scanner</span><span class="p">.</span><span class="n">caching</span><span class="p">=</span>1000 <span class="o">\</span>
+   <span class="o">-</span><span class="n">Dmapred</span><span class="p">.</span><span class="n">map</span><span class="p">.</span><span class="n">tasks</span><span class="p">.</span><span class="n">speculative</span><span class="p">.</span><span class="n">execution</span><span class="p">=</span><span class="n">false</span> <span class="n">verify_dir</span> 1000
+</pre></div>
+
+
 <p>Dependent on how you have your Hadoop and HBase setup deployed, you may need to
 change the <code>gorachi.sh</code> script around some.  Here is one suggestion that may help
 in the case where your Hadoop and HBase configuration are other than under the
 Hadoop and HBase home directories.</p>
-<p><code>
-  diff --git a/org.apache.gora.goraci.sh b/org.apache.gora.goraci.sh
-  index db1562a..31c3c94 100755
-  --- a/org.apache.gora.goraci.sh
-  +++ b/org.apache.gora.goraci.sh
-  @@ -95,6 +95,4 @@ done
-   #run it
-   export HADOOP_CLASSPATH="$CLASSPATH"
-   LIBJARS=<code>echo $HADOOP_CLASSPATH | tr : ,</code>
-  -hadoop jar "$GORACI_HOME/lib/org.apache.gora.goraci-0.0.1-SNAPSHOT.jar" $CLASS -libjars "$LIBJARS" "$@"
-  -
-  -
-  +CLASSPATH="${HBASE_CONF_DIR}" hadoop --config "${HADOOP_CONF_DIR} jar "$GORACI_HOME/lib/org.apache.gora.goraci-0.0.1-SNAPSHOT.jar" $CLASS -files "${HBASE_CONF_DIR}/hbase-site.xml" -libjars "$LIBJARS" "$@"
-</code></p>
+<div class="codehilite"><pre><span class="gh">diff --git a/org.apache.gora.goraci.sh b/org.apache.gora.goraci.sh</span>
+<span class="gh">index db1562a..31c3c94 100755</span>
+<span class="gd">--- a/org.apache.gora.goraci.sh</span>
+<span class="gi">+++ b/org.apache.gora.goraci.sh</span>
+<span class="gu">@@ -95,6 +95,4 @@ done</span>
+ #run it
+ export HADOOP_CLASSPATH=&quot;$CLASSPATH&quot;
+ LIBJARS=`echo $HADOOP_CLASSPATH | tr : ,`
+ -hadoop jar &quot;$GORACI_HOME/lib/org.apache.gora.goraci-0.0.1-SNAPSHOT.jar&quot; $CLASS -libjars &quot;$LIBJARS&quot; &quot;$@&quot;
+ -
+ -
+ +CLASSPATH=&quot;${HBASE_CONF_DIR}&quot; hadoop --config &quot;${HADOOP_CONF_DIR} jar &quot;$GORACI_HOME/lib/org.apache.gora.goraci-0.0.1-SNAPSHOT.jar&quot; $CLASS -files &quot;${HBASE_CONF_DIR}/hbase-site.xml&quot; -libjars &quot;$LIBJARS&quot; &quot;$@&quot;
+</pre></div>
+
+
 <p>You will need to define <code>HBASE_CONF_DIR</code> and </code>HADOOP_CONF_DIR</code> before you run your
 <strong>goraci</strong> jobs.  For example:</p>
-<p><code>
-  $ export HADOOP_CONF_DIR=/home/you/hadoop-conf</p>
-<p>$ export HBASE_CONF_DIR=/home/you/hbase-conf</p>
-<p>$ PATH=/home/you/hadoop-1.0.2/bin:$PATH ./goraci.sh Generator 1000 1000000
-</code></p>
+<div class="codehilite"><pre>$ <span class="n">export</span> <span class="n">HADOOP_CONF_DIR</span><span class="p">=</span><span class="o">/</span><span class="n">home</span><span class="o">/</span><span class="n">you</span><span class="o">/</span><span class="n">hadoop</span><span class="o">-</span><span class="n">conf</span>
+$ <span class="n">export</span> <span class="n">HBASE_CONF_DIR</span><span class="p">=</span><span class="o">/</span><span class="n">home</span><span class="o">/</span><span class="n">you</span><span class="o">/</span><span class="n">hbase</span><span class="o">-</span><span class="n">conf</span>
+$ <span class="n">PATH</span><span class="p">=</span><span class="o">/</span><span class="n">home</span><span class="o">/</span><span class="n">you</span><span class="o">/</span><span class="n">hadoop</span><span class="o">-</span>1<span class="p">.</span>0<span class="p">.</span>2<span class="o">/</span><span class="n">bin</span><span class="p">:</span>$<span class="n">PATH</span> <span class="o">./</span><span class="n">goraci</span><span class="p">.</span><span class="n">sh</span> <span class="n">Generator</span> 1000 1000000
+</pre></div>
+
+
 <h4 id="concurrency">Concurrency</h4>
 <p>Its possible to run verification at the same time as generation.  To do this
 supply the -c option to Generator and Verify.  This will cause Genertor to
@@ -385,39 +395,42 @@ are useful for assesing performance.</p>
 <p>Below shows running a test of the test.  Ingest one linked list, deleted a node
 in it, ensure the verifaction map reduce job notices that the node is missing.
 Not all output is shown, just the important parts.</p>
-<p><code>
-  $ ./org.apache.gora.goraci.sh Generator  1 25000000</p>
-<p>$ ./org.apache.gora.goraci.sh Print -s 2000000000000000 -l 1</p>
-<p>2000001f65dbd238:30350f9ae6f6e8f7:000004265852:ef09f9dd-75b1-4c16-9f14-0fa84f3029b6</p>
-<p>$ ./org.apache.gora.goraci.sh Print -s 30350f9ae6f6e8f7 -l 1</p>
-<p>30350f9ae6f6e8f7:4867fe03de6ea6c8:000003265852:ef09f9dd-75b1-4c16-9f14-0fa84f3029b6</p>
-<p>$ ./org.apache.gora.goraci.sh Delete 30350f9ae6f6e8f7</p>
-<p>Delete returned true</p>
-<p>$ ./org.apache.gora.goraci.sh Verify gci_verify_1 2 </p>
-<p>11/12/20 17:12:31 INFO mapred.JobClient:   org.apache.gora.goraci.Verify$Counts</p>
-<p>11/12/20 17:12:31 INFO mapred.JobClient:     UNDEFINED=1</p>
-<p>11/12/20 17:12:31 INFO mapred.JobClient:     REFERENCED=24999998</p>
-<p>11/12/20 17:12:31 INFO mapred.JobClient:     UNREFERENCED=1</p>
-<p>$ hadoop fs -cat gci_verify_1/part* 30350f9ae6f6e8f7 2000001f65dbd238
-</code></p>
+<div class="codehilite"><pre>$ <span class="o">./</span><span class="n">goraci</span><span class="p">.</span><span class="n">sh</span> <span class="n">Generator</span>  1 25000000
+$ <span class="o">./</span><span class="n">goraci</span><span class="p">.</span><span class="n">sh</span> <span class="n">Print</span> <span class="o">-</span><span class="n">s</span> 2000000000000000 <span class="o">-</span><span class="n">l</span> 1
+  2000001<span class="n">f65dbd238</span><span class="p">:</span>30350<span class="n">f9ae6f6e8f7</span><span class="p">:</span>000004265852<span class="p">:</span><span class="n">ef09f9dd</span><span class="o">-</span>75<span class="n">b1</span><span class="o">-</span>4<span class="n">c16</span><span class="o">-</span>9<span class="n">f14</span><span class="o">-</span>0<span class="n">fa84f3029b6</span>
+$ <span class="o">./</span><span class="n">goraci</span><span class="p">.</span><span class="n">sh</span> <span class="n">Print</span> <span class="o">-</span><span class="n">s</span> 30350<span class="n">f9ae6f6e8f7</span> <span class="o">-</span><span class="n">l</span> 1
+  30350<span class="n">f9ae6f6e8f7</span><span class="p">:</span>4867<span class="n">fe03de6ea6c8</span><span class="p">:</span>000003265852<span class="p">:</span><span class="n">ef09f9dd</span><span class="o">-</span>75<span class="n">b1</span><span class="o">-</span>4<span class="n">c16</span><span class="o">-</span>9<span class="n">f14</span><span class="o">-</span>0<span class="n">fa84f3029b6</span>
+$ <span class="o">./</span><span class="n">goraci</span><span class="p">.</span><span class="n">sh</span> <span class="n">Delete</span> 30350<span class="n">f9ae6f6e8f7</span>
+  <span class="n">Delete</span> <span class="n">returned</span> <span class="n">true</span>
+$ <span class="o">./</span><span class="n">goraci</span><span class="p">.</span><span class="n">sh</span> <span class="n">Verify</span> <span class="n">gci_verify_1</span> 2 
+  11<span class="o">/</span>12<span class="o">/</span>20 17<span class="p">:</span>12<span class="p">:</span>31 <span class="n">INFO</span> <span class="n">mapred</span><span class="p">.</span><span class="n">JobClient</span><span class="p">:</span>   <span class="n">org</span><span class="p">.</span><span class="n">apache</span><span class="p">.</span><span class="n">gora</span><span class="p">.</span><span class="n">goraci</span><span class="p">.</span><span class="n">Verify</span>$<span class="n">Counts</span>
+  11<span class="o">/</span>12<span class="o">/</span>20 17<span class="p">:</span>12<span class="p">:</span>31 <span class="n">INFO</span> <span class="n">mapred</span><span class="p">.</span><span class="n">JobClient</span><span class="p">:</span>     <span class="n">UNDEFINED</span><span class="p">=</span>1
+  11<span class="o">/</span>12<span class="o">/</span>20 17<span class="p">:</span>12<span class="p">:</span>31 <span class="n">INFO</span> <span class="n">mapred</span><span class="p">.</span><span class="n">JobClient</span><span class="p">:</span>     <span class="n">REFERENCED</span><span class="p">=</span>24999998
+  11<span class="o">/</span>12<span class="o">/</span>20 17<span class="p">:</span>12<span class="p">:</span>31 <span class="n">INFO</span> <span class="n">mapred</span><span class="p">.</span><span class="n">JobClient</span><span class="p">:</span>     <span class="n">UNREFERENCED</span><span class="p">=</span>1
+$ <span class="n">hadoop</span> <span class="n">fs</span> <span class="o">-</span><span class="nb">cat</span> <span class="n">gci_verify_1</span><span class="o">/</span><span class="n">part</span><span class="o">\*</span> 30350<span class="n">f9ae6f6e8f7</span>   2000001<span class="n">f65dbd238</span>
+</pre></div>
+
+
 <p>The map reduce job found the one undefined node and gave the node that
 referenced it.</p>
 <p>Below are some timing statistics for running Goraci on a 10 node cluster. </p>
-<p><code>
-  Store           | Task                   | Time    | Undef  | Unref | Ref      <br />
-  ----------------+------------------------+---------+--------+-------+------------
-  accumulo-1.4.0  | Generator 10 100000000 | 40m 16s |    N/A |   N/A |        N/A   <br />
-  accumulo-1.4.0  | Verify /tmp/goraci1 40 |  6m  7s |      0 |     0 | 1000000000<br />
-  hbase-0.92.1    | Generator 10 100000000 |  2h 44m |    N/A |   N/A |        N/A   <br />
-  hbase-0.92.1    | Verify /tmp/goraci2 40 |  6m 34s |      0 |     0 | 1000000000
-</code></p>
+<div class="codehilite"><pre>Store           | Task                   | Time    | Undef  | Unref | Ref        
+----------------+------------------------+---------+--------+-------+------------
+accumulo-1.4.0  | Generator 10 100000000 | 40m 16s |    N/A |   N/A |        N/A     
+accumulo-1.4.0  | Verify /tmp/goraci1 40 |  6m  7s |      0 |     0 | 1000000000  
+hbase-0.92.1    | Generator 10 100000000 |  2h 44m |    N/A |   N/A |        N/A     
+hbase-0.92.1    | Verify /tmp/goraci2 40 |  6m 34s |      0 |     0 | 1000000000
+</pre></div>
+
+
 <p>HBase and Accumulo are configured differently out-of-the-box.  We used the Accumulo
 3G, native configuration examples in the <a href="https://github.com/apache/gora/tree/master/gora-goraci/src/main/resources">conf/examples</a> directory.</p>
 <p>To provide a comparable memory footprint, we increased the HBase jvm to "-Xmx4000m",
 and turned on compression for the ci table:</p>
-<p><code>
-create 'ci', {NAME=&gt;'meta', COMPRESSION=&gt;'GZ'}
-</code></p>
+<div class="codehilite"><pre><span class="n">create</span> <span class="s">&#39;ci&#39;</span><span class="p">,</span> <span class="p">{</span><span class="n">NAME</span><span class="p">=</span><span class="o">&gt;</span><span class="s">&#39;meta&#39;</span><span class="p">,</span> <span class="n">COMPRESSION</span><span class="p">=</span><span class="o">&gt;</span><span class="s">&#39;GZ&#39;</span><span class="p">}</span>
+</pre></div>
+
+
 <p>We also turned down the replication of write-ahead logs to be comparable to Accumulo:</p>
 <div class="codehilite"><pre><span class="nt">&lt;property&gt;</span>
   <span class="nt">&lt;name&gt;</span>hbase.regionserver.hlog.replication<span class="nt">&lt;/name&gt;</span>


