gora-commits mailing list archives

From build...@apache.org
Subject svn commit: r897890 - in /websites/staging/gora/trunk/content: ./ current/tutorial.html
Date Fri, 14 Feb 2014 11:31:01 GMT
Author: buildbot
Date: Fri Feb 14 11:31:00 2014
New Revision: 897890

Log:
Staging update by buildbot for gora

Modified:
    websites/staging/gora/trunk/content/   (props changed)
    websites/staging/gora/trunk/content/current/tutorial.html

Propchange: websites/staging/gora/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Fri Feb 14 11:31:00 2014
@@ -1 +1 @@
-1567983
+1568242

Modified: websites/staging/gora/trunk/content/current/tutorial.html
==============================================================================
--- websites/staging/gora/trunk/content/current/tutorial.html (original)
+++ websites/staging/gora/trunk/content/current/tutorial.html Fri Feb 14 11:31:00 2014
@@ -157,13 +157,13 @@ know how the values are persisted.</p>
 <p>Gora has a modular architecture. Most of the data stores in Gora, 
 has it's own module, such as gora-hbase, gora-cassandra,
 and gora-sql. In your projects, you need to only include 
-the artifacts from the modules you use. You can consult the <a href="/quickstart.html">quick start</a>
+the artifacts from the modules you use. You can consult the <a href="/current/quickstart.html">quick start</a>
 for setting up your project.</p>
 <h2 id="setting-up-gora">Setting up Gora</h2>
 <p>As a first step, we need to download and compile the Gora source code. The source codes 
 for the tutorial is in the gora-tutorial module. If you have
 already downloaded Gora, that's cool, otherwise, please go
-over the steps at the <a href="/quickstart.html">quickstart</a> guide for
+over the steps at the <a href="/current/quickstart.html">quickstart</a> guide for
 how to download and compile Gora.</p>
 <p>Now, after the source code for Gora is at hand, let's have a look at the files under the 
 directory gora-tutorial. </p>
@@ -206,17 +206,16 @@ $ <span class="n">tree</span>
 
 
 <p>Since gora-tutorial is a top level module of Gora, it depends on the directory
-structure imposed by Gora's main build scripts (build.xml and 
-build-common.xml with Ivy and pom.xml for Maven). The Java source code resides in directory 
-src/main/java/, avro schemas in src/main/avro/, and data in src/main/resources/.</p>
+structure imposed by Gora's main build scripts (pom.xml for Maven). The Java source code resides in directory 
+<code>src/main/java/</code>, avro schemas in <code>src/main/avro/</code>, and data in <code>src/main/resources/</code>.</p>
 <h2 id="setting-up-hbase">Setting up HBase</h2>
-<p>For this tutorial we will be using HBase to 
+<p>For this tutorial we will be using <a href="http://hbase.apache.org">HBase</a> to 
 store the logs. For those of you not familiar with HBase, it is a NoSQL
 column store with an architecture very similar to Google's BigTable.</p>
 <p>If you don't already have already HBase setup, you can go over the steps at 
 <a href="http://hbase.apache.org/book/quickstart.html">HBase Overview</a>
 documentation. Gora aims to support the most recent HBase versions however if you
-find compatability problems please <a href="../mailing_lists.html">get in touch</a>.
+find compatibility problems please <a href="../mailing_lists.html">get in touch</a>.
 So download an <a href="http://www.apache.org/dyn/closer.cgi/hbase/">HBase release</a>. 
 After extracting the file, cd to the hbase-${dist} directory and start the HBase server. </p>
 <div class="codehilite"><pre>$ <span class="n">bin</span><span
class="o">/</span><span class="n">start</span><span class="o">-</span><span
class="n">hbase</span><span class="p">.</span><span class="n">sh</span>
@@ -230,7 +229,7 @@ After extracting the file, cd to the hba
 
 <h2 id="configuring-gora">Configuring Gora</h2>
 <p>Gora is configured through a file in the classpath named gora.properties. 
-We will be using the following file gora-tutorial/conf/gora.properties</p>
+We will be using the following file <code>gora-tutorial/conf/gora.properties</code></p>
 <div class="codehilite"><pre>  <span class="n">gora</span><span
class="p">.</span><span class="n">datastore</span><span class="p">.</span><span
class="n">default</span><span class="p">=</span><span class="n">org</span><span
class="p">.</span><span class="n">apache</span><span class="p">.</span><span
class="n">gora</span><span class="p">.</span><span class="n">hbase</span><span
class="p">.</span><span class="n">store</span><span class="p">.</span><span
class="n">HBaseStore</span>
   <span class="n">gora</span><span class="p">.</span><span class="n">datastore</span><span
class="p">.</span><span class="n">autocreateschema</span><span class="p">=</span><span
class="n">true</span>
 </pre></div>
@@ -238,11 +237,11 @@ We will be using the following file gora
 
 <p>This file states that the default store will be HBaseStore,
 and schemas(tables) should be automatically created.
-More information for configuring different settings in gora.properties 
-can be found <a href="/gora-conf.html">here</a>.</p>
-<h2 id="modelling-the-data">Modelling the data</h2>
+More information for configuring different settings in <code>gora.properties</code> 
+can be found <a href="/current/gora-conf.html">here</a>.</p>
+<h2 id="modeling-the-data">Modeling the data</h2>
 <p>For this tutorial, we will be parsing and storing the logs of a web server. 
-Some example logs are at src/main/resources/access.log.tar.gz, which 
+Some example logs are at <code>src/main/resources/access.log.tar.gz</code>, which 
 belongs to the (now shutdown) server at http://www.buldinle.com/. 
 Example logs contain 10,000 lines, between dates 2009/03/10 - 2009/03/15.
 The first thing, we need to do is to extract the logs.</p>
@@ -266,12 +265,12 @@ returned, Referrer, and User Agent.</p>
 <p>Data beans are the main way to hold the data in memory and persist in Gora. Gora 
 needs to explicitly keep track of the status of the data in memory, so 
 we use <a href="http://avro.apache.org">Apache Avro</a> for defining the beans. Using 
-Avro gives us the possibility to explicitly keep track object's persistency state, 
+Avro gives us the possibility to explicitly keep track object's persistent state, 
 and a way to serialize object's data. 
 Defining data beans is a very easy task, but for the exact syntax, please 
 consult to <a href="http://avro.apache.org/docs/current/spec.html">Avro Specification</a>.
 First, we need to define the bean Pageview to hold a
-single URL access in the logs. Let's go over the class at src/main/avro/pageview.json </p>
+single URL access in the logs. Let's go over the class at <code>src/main/avro/pageview.json</code></p>
 <div class="codehilite"><pre> <span class="p">{</span>
   &quot;<span class="n">type</span>&quot;<span class="p">:</span>
&quot;<span class="n">record</span>&quot;<span class="p">,</span>
   &quot;<span class="n">name</span>&quot;<span class="p">:</span>
&quot;<span class="n">Pageview</span>&quot;<span class="p">,</span>
@@ -297,7 +296,7 @@ namespace which is mapped to the package
 are listed in the "fields" element. Each field is given with its type. </p>
 <h2 id="compiling-avro-schemas">Compiling Avro Schemas</h2>
 <p>The next step after defining the data beans is to compile the schemas 
-into Java classes. For that we will use GoraCompiler&gt;. 
+into Java classes. For that we will use the <a href="/current/compiler.html">GoraCompiler</a>. 
 Invoking the Gora compiler by (from Gora top level directory)</p>
 <div class="codehilite"><pre>$ <span class="n">bin</span><span
class="o">/</span><span class="n">gora</span> <span class="n">goracompiler</span>
 </pre></div>
@@ -326,9 +325,9 @@ Invoking the Gora compiler by (from Gora
 </pre></div>
 
 
-<p>to compile the Pageview class into gora-tutorial/src/main/java/org/apache/gora/tutorial/log/generated/Pageview.java. 
+<p>to compile the Pageview class into <code>gora-tutorial/src/main/java/org/apache/gora/tutorial/log/generated/Pageview.java</code>. 
 This will use the default license header which is ASLv2 for licensing the generated data beans.
-However, the tutorial java classes are already committed, so you do not need to do that now.</p>
+However, the tutorial java classes are already committed and present within SVN, so you do not need to do that now.</p>
 <p>Gora compiler extends Avro's SpecificCompiler to convert JSON definition 
 into a Java class. Generated classes extend the Persistent interface. 
 Most of the methods of the Persistent interface deal with bookkeeping for 
@@ -387,8 +386,8 @@ mapping format, so that data-store speci
 The mapping files declare how the fields of the classes declared in Avro schemas 
 are serialized and persisted to the data store.</p>
 <h3 id="hbase-mappings">HBase mappings</h3>
-<p>HBase mappings are stored at file named gora-hbase-mappings.xml. 
-For this tutorial we will be using the file gora-tutorial/conf/gora-hbase-mappings.xml.</p>
+<p>HBase mappings are stored at file named <code>gora-hbase-mappings.xml</code>. 
+For this tutorial we will be using the file <code>gora-tutorial/conf/gora-hbase-mappings.xml</code>.</p>
 <div class="codehilite"><pre>  <span class="c">&lt;!--  This is gora-sql-mapping.xml</span>
 
 <span class="c">&lt;gora-orm&gt;</span>
@@ -434,11 +433,11 @@ For this tutorial we will be using the f
 </pre></div>
 
 
-<p>Every mapping file starts with the top level element <gora-orm>. 
+<p>Every mapping file starts with the top level element <code><gora-orm></code>. 
 Gora HBase mapping files can have two type of child elements, table and 
 class declarations. All of the table and class definitions should be 
 listed at this level.</p>
-<p>table declaration is optional and most of the time, Gora infers the table 
+<p>The table declaration is optional and most of the time, Gora infers the table 
 declaration from the class sub elements. However, some of the HBase 
 specific table configuration such as compression, blockCache, etc can be given here, 
 if Gora is used to auto-create the tables. The exact syntax for the file can be found 
@@ -449,7 +448,7 @@ DataStore API expects to know the class 
 they can be instantiated. The key value pair is declared in the class element.
 The name attribute is the fully qualified name of the class, 
 and the keyClass attribute is the fully qualified class name of the key class.</p>
-<p>Children of the &lt;class&gt; element are &lt;field&gt; 
+<p>Children of the <code>class</code> element are <code>field</code> 
 elements. Each field element has a name and family attribute, and 
 an optional qualifier attribute. name attribute contains the name 
 of the field in the persistent class, and family declares the column family 
@@ -458,11 +457,11 @@ as the column qualifier. Note that map a
 families, so the configuration should be list unique column families for each map and 
 array type, and no qualifier should be given. The exact data model is discussed further 
 at the <a href="/current/gora-hbase.html">gora-hbase</a> documentation. </p>
-<h2 id="basic-api-wzxhzdk54">Basic API </title></h2>
+<h2 id="basic-api-wzxhzdk78">Basic API </title></h2>
 <h3 id="parsing-the-logs">Parsing the logs</h3>
 <p>Now that we have the basic setup, we can see Gora API in action. As you can notice below the API 
 is pretty simple to use. We will be using the class LogManager (which is located at
-gora-tutorial/src/main/java/org/apache/gora/tutorial/log/LogManager.java) for parsing 
+<code>gora-tutorial/src/main/java/org/apache/gora/tutorial/log/LogManager.java</code>) for parsing 
 and storing the logs, deleting some lines and querying. </p>
 <p>First of all, let us look at the constructor. The only real thing it does is to call the 
 init() method. init() method constructs the 
@@ -491,7 +490,7 @@ data store, and the value is the actual 
 Avro schema definitions using the Gora compiler.</p>
 <p>Data store objects are created by DataStoreFactory. It is necessary to 
 provide the key and value class. The datastore class is optional, 
-and if not specified it will be read from the configuration (gora.properties).</p>
+and if not specified it will be read from the configuration (<code>gora.properties</code>).</p>
 <p>For this tutorial, we have already defined the avro schema to use and compiled
 our data bean into Pageview class. For keys in the data store, we will be using Longs. 
 The keys will hold the line of the pageview in the data file.</p>
@@ -540,7 +539,7 @@ The keys will hold the line of the pagev
 </pre></div>
 
 
-<p>So to parse and store our logs located at gora-tutorial/src/main/resources/access.log, we will issue:</p>
+<p>So to parse and store our logs located at <code>gora-tutorial/src/main/resources/access.log</code>, we will issue:</p>
 <div class="codehilite"><pre>$ <span class="n">bin</span><span
class="o">/</span><span class="n">gora</span> <span class="n">logmanager</span>
<span class="o">-</span><span class="n">parse</span> <span class="n">gora</span><span
class="o">-</span><span class="n">tutorial</span><span class="o">/</span><span
class="n">src</span><span class="o">/</span><span class="n">main</span><span
class="o">/</span><span class="n">resources</span><span class="o">/</span><span
class="n">access</span><span class="p">.</span><span class="nb">log</span>
 </pre></div>
 
@@ -631,7 +630,7 @@ LogManager always closes it's datastore 
 the data store, you can also the flush()
 method which, as expected, flushes the data to the underlying data store. However, the actual flush 
 semantics can vary by the data store backend. For example, in SQL flush calls commit()
-on the jdbc Connection object, whereas in Hbase, HTable#flush() is called.
+on the jdbc Connection object, whereas in Hb=Base, <code>HTable#flush()</code> is called.
 Also note that even if you call flush() at the end of all data manipulation operations, 
 you still need to call the close() on the datastore.</p>
 <h2 id="persisted-data-in-hbase">Persisted data in HBase</h2>
@@ -676,7 +675,7 @@ gora-hbase-mapping.xml. Looking at the c
 
 <p>The output shows all the columns matching the first line with key 0. We can see 
 the columns common:ip, common:timestamp, common:url, etc. Remember that 
-these are the columns that we have described in the gora-hbase-mapping.xml file.</p>
+these are the columns that we have described in the <code>gora-hbase-mapping.xml</code> file.</p>
 <p>You can also count the number of entries in the table to make sure that all the records
 have been stored.</p>
 <div class="codehilite"><pre><span class="n">hbase</span><span
class="p">(</span><span class="n">main</span><span class="p">):</span>010<span
class="p">:</span>0<span class="o">&gt;</span> <span class="n">count</span>
<span class="s">&#39;AccessLog&#39;</span>
@@ -690,9 +689,9 @@ have been stored.</p>
 two methods for fetching objects. First one is to fetch a single object given it's key. The 
 second method is to run a query through the data store.</p>
 <p>To fetch objects one by one, we can use one of the overloaded 
-get() methods. 
-The method with signature get(K key) returns the object corresponding to the given key fetching all the 
-fields. On the other hand get(K key, String[] fields) returns the object corresponding to the 
+<code>get()</code> methods. 
+The method with signature <code>get(K key)</code> returns the object corresponding to the given key fetching all the 
+fields. On the other hand <code>get(K key, String[] fields)</code> returns the object corresponding to the 
 given key, but fetching only the fields given as the second argument.</p>
 <p>When run with the argument -get LogManager class fetches the pageview object 
 from the data store and prints the results.</p>
@@ -723,7 +722,7 @@ from the data store and prints the resul
 <h2 id="querying-objects">Querying objects</h2>
 <p>DataStore API defines a Query interface to query the objects at the data store. 
 Each data store implementation can use a specific implementation of the Query interface. Queries are 
-instantiated by calling DataStore#newQuery(). When the query is run through the datastore, the results 
+instantiated by calling <code>DataStore#newQuery()</code>. When the query is run through the datastore, the results 
 are returned via the Result interface. Let's see how we can run a query and display the results below in the 
 the LogManager class.</p>
 <div class="codehilite"><pre><span class="cm">/** Queries and prints pageview
object that have keys between startKey and endKey*/</span>
@@ -741,10 +740,10 @@ the LogManager class.</p>
 
 
 <p>After constructing a Query, its properties 
-are set via the setter methods. Then calling query.execute() returns
-the Result object.</p>
+are set via the setter methods. Then calling <code>query.execute()</code> returns
+the <code>Result</code> object.</p>
 <p>Result interface allows us to iterate the results one by one by calling the 
-next() method. The getKey() method returns the current key and get()
+<code>next()</code> method. The <code>getKey()</code> method returns the current key and <code>get()</code>
 returns current persistent object.</p>
 <div class="codehilite"><pre><span class="n">private</span> <span
class="n">void</span> <span class="n">printResult</span><span class="p">(</span><span
class="n">Result</span><span class="o">&lt;</span><span class="n">Long</span><span
class="p">,</span> <span class="n">Pageview</span><span class="o">&gt;</span>
<span class="n">result</span><span class="p">)</span> <span class="n">throws</span>
<span class="n">IOException</span> <span class="p">{</span>
 
@@ -798,9 +797,9 @@ we can use:</p>
 <h2 id="deleting-objects">Deleting objects</h2>
 <p>Just like fetching objects, there are two main methods to delete 
 objects from the data store. The first one is to delete objects one by 
-one using the DataStore#delete(K) method, which takes the key of the object. 
+one using the <code>DataStore#delete(K key)</code> method, which takes the key of the object. 
 Alternatively we can delete all of the data that matches a given query by 
-calling the DataStore#deleteByQuery(Query) method. By using deleteByQuery, we can 
+calling the <code>DataStore#deleteByQuery(Query query)</code> method. By using <code>#deleteByQuery</code>, we can 
 do fine-grain deletes, for example deleting just a specific field 
 from several records. 
 Continueing from the LogManager class, the api's for both are given below.</p>
@@ -839,7 +838,7 @@ serialization, Gora extends Avro DatumWr
 stored at HBase earlier. Specifically, we will develop a MapReduce program to 
 calculate the number of daily pageviews for each URL in the site.</p>
 <p>We will be using the LogAnalytics class to analyze the logs, which can
-be found at gora-tutorial/src/main/java/org/apache/gora/tutorial/log/LogAnalytics.java.
+be found at <code>gora-tutorial/src/main/java/org/apache/gora/tutorial/log/LogAnalytics.java</code>.
 For computing the analytics, the mapper takes in pageviews, and outputs tuples of 
 &lt;URL, timestamp&gt; pairs, with 1 as the value. The timestamp represents the day 
 in which the pageview occurred, so that the daily pageviews are accumulated. 
@@ -867,19 +866,25 @@ Ofcourse MySQL users should uncomment th
 and give necessary permissions to create tables, etc so that Gora can run properly.</p>
 <h3 id="configuring-gora_1">Configuring Gora</h3>
 <p>We will put the configuration necessary to connect to the database to 
-gora-tutorial/conf/gora.properties.</p>
+<code>gora-tutorial/conf/gora.properties</code>.</p>
 <h4 id="jdbc-properties-for-gora-sql-module-using-hsql">JDBC properties for gora-sql module using HSQL</h4>
-<p>gora.sqlstore.jdbc.driver=org.hsqldb.jdbcDriver
-gora.sqlstore.jdbc.url=jdbc:hsqldb:hsql://localhost/goratest</p>
+<div class="codehilite"><pre><span class="n">gora</span><span
class="p">.</span><span class="n">sqlstore</span><span class="p">.</span><span
class="n">jdbc</span><span class="p">.</span><span class="n">driver</span><span
class="p">=</span><span class="n">org</span><span class="p">.</span><span
class="n">hsqldb</span><span class="p">.</span><span class="n">jdbcDriver</span>
+<span class="n">gora</span><span class="p">.</span><span class="n">sqlstore</span><span
class="p">.</span><span class="n">jdbc</span><span class="p">.</span><span
class="n">url</span><span class="p">=</span><span class="n">jdbc</span><span
class="p">:</span><span class="n">hsqldb</span><span class="p">:</span><span
class="n">hsql</span><span class="p">:</span><span class="o">//</span><span
class="n">localhost</span><span class="o">/</span><span class="n">goratest</span>
+</pre></div>
+
+
 <h4 id="jdbc-properties-for-gora-sql-module-using-mysql">JDBC properties for gora-sql module using MySQL</h4>
-<p>gora.sqlstore.jdbc.driver=com.mysql.jdbc.Driver
-gora.sqlstore.jdbc.url=jdbc:mysql://localhost:3306/goratest
-gora.sqlstore.jdbc.user=root
-gora.sqlstore.jdbc.password=      </p>
+<div class="codehilite"><pre><span class="n">gora</span><span
class="p">.</span><span class="n">sqlstore</span><span class="p">.</span><span
class="n">jdbc</span><span class="p">.</span><span class="n">driver</span><span
class="p">=</span><span class="n">com</span><span class="p">.</span><span
class="n">mysql</span><span class="p">.</span><span class="n">jdbc</span><span
class="p">.</span><span class="n">Driver</span>
+<span class="n">gora</span><span class="p">.</span><span class="n">sqlstore</span><span
class="p">.</span><span class="n">jdbc</span><span class="p">.</span><span
class="n">url</span><span class="p">=</span><span class="n">jdbc</span><span
class="p">:</span><span class="n">mysql</span><span class="p">:</span><span
class="o">//</span><span class="n">localhost</span><span class="p">:</span>3306<span
class="o">/</span><span class="n">goratest</span>
+<span class="n">gora</span><span class="p">.</span><span class="n">sqlstore</span><span
class="p">.</span><span class="n">jdbc</span><span class="p">.</span><span
class="n">user</span><span class="p">=</span><span class="n">root</span>
+<span class="n">gora</span><span class="p">.</span><span class="n">sqlstore</span><span
class="p">.</span><span class="n">jdbc</span><span class="p">.</span><span
class="n">password</span><span class="p">=</span>
+</pre></div>
+
+
 <p>As expected the jdbc.driver property is the JDBC driver class,
 and jdbc.url is the JDBC connection URL. Moreover jdbc.user
 and jdbc.password can be specific is needed. More information for these 
-parameters can be found at <a href="/gora-sql.html">gora-sql</a> documentation. </p>
+parameters can be found at <a href="/current/gora-sql.html">gora-sql</a> documentation. </p>
 <h3 id="modelling-the-data-data-beans-for-analytics">Modelling the data - Data Beans for Analytics</h3>
 <p>For web site analytics, we will be using a generic MetricDatum
 data structure. It holds a string metricDimension, a long 
@@ -889,8 +894,8 @@ metric value. For example we might have 
 timestamp=101, metric=12}, representing that there have been 12 pageviews to 
 the URL "/index" for the given time interval 101.</p>
 <p>The avro schema definition for MetricDatum can be found at 
-gora-tutorial/src/main/avro/metricdatum.json, and the compiled source 
-code at gora-tutorial/src/main/java/org/apache/gora/tutorial/log/generated/MetricDatum.java.</p>
+<code>gora-tutorial/src/main/avro/metricdatum.json</code>, and the compiled source 
+code at <code>gora-tutorial/src/main/java/org/apache/gora/tutorial/log/generated/MetricDatum.java</code>.</p>
 <div class="codehilite"><pre><span class="p">{</span>
   &quot;<span class="n">type</span>&quot;<span class="p">:</span>
&quot;<span class="n">record</span>&quot;<span class="p">,</span>
   &quot;<span class="n">name</span>&quot;<span class="p">:</span>
&quot;<span class="n">MetricDatum</span>&quot;<span class="p">,</span>
@@ -908,8 +913,8 @@ code at gora-tutorial/src/main/java/org/
 <p>We will be using the SQL backend to store the job output data, just to 
 demonstrate the SQL backend. </p>
 <p>Similar to what we have seen with HBase, gora-sql plugin reads configuration from the 
-gora-sql-mappings.xml file. 
-Specifically, we will use the gora-tutorial/conf/gora-sql-mappings.xml file.    </p>
+<code>gora-sql-mappings.xml</code> file. 
+Specifically, we will use the <code>gora-tutorial/conf/gora-sql-mappings.xml</code> file.    </p>
 <div class="codehilite"><pre><span class="nt">&lt;gora-orm&gt;</span>
   ...
   <span class="nt">&lt;class</span> <span class="na">name=</span><span
class="s">&quot;org.apache.gora.tutorial.log.generated.MetricDatum&quot;</span>
<span class="na">keyClass=</span><span class="s">&quot;java.lang.String&quot;</span>
<span class="na">table=</span><span class="s">&quot;Metrics&quot;</span><span
class="nt">&gt;</span>
@@ -943,13 +948,13 @@ However, if the mapper or reducer extend
 you can use the static methods defined in GoraMapper and 
 GoraReducer since they are more convenient. </p>
 <p>For this tutorial we will use Gora as both input and output. As can be seen from the 
-createJob() function, quoted below, we create the job 
+<code>createJob()</code> function, quoted below, we create the job 
 as normal, and set the input parameters via 
-GoraMapper#initMapperJob(), and GoraReducer#initReducerJob(). 
-GoraMapper#initMapperJob() takes a store and an optional query to fetch the data from. 
+<code>GoraMapper#initMapperJob()</code>, and <code>GoraReducer#initReducerJob()</code>. </p>
+<p><code>GoraMapper#initMapperJob()</code> takes a store and an optional query to fetch the data from. 
 When a query is given, only the results of the query is used as the input of the job, if not all the records are used. 
-The actual Mapper, map output key and value classes are passed to initMapperJob() 
-function as well. GoraReducer#initReducerJob() accepts 
+The actual Mapper, map output key and value classes are passed to <code>initMapperJob()</code> 
+function as well. <code>GoraReducer#initReducerJob()</code> accepts 
 the data store to store the job's output as well as the actual reducer class.
 initMapperJob and initReducerJob functions have also overriden methods that take the data store class 
 rather than data store instances.</p>
@@ -1070,7 +1075,7 @@ if we are using HSQLDB, below command ca
 
 <p>In the connection URL, the same URL that we have provided in gora.properties should be used. If on the other hand 
 MySQL is used, than we should be able to see the output using the mysql command line utility. </p>
-<p>The results of the job are stored at the table Metrics, which is defined at the gora-sql-mapping.xml 
+<p>The results of the job are stored at the table Metrics, which is defined at the <code>gora-sql-mapping.xml</code> 
 file. Running a select query over this data confirms that the daily pageview metrics for the web site is indeed stored.
 To see the most popular pages, run:</p>
 <p>&gt; SELECT METRICDIMENSION, TS, METRIC  FROM metrics order by metric desc</p>
@@ -1088,17 +1093,17 @@ To see the most popular pages, run:</p>
 <tr><td>...</td> <td>...</td> <td>...</td></tr>
       </table>
 
-<p>As you can see, the home page (/) for varios days and some other pages are listed. 
+<p>As you can see, the home page (/) for various days and some other pages are listed. 
 In total 3033 rows are present at the metrics table. </p>
 <h3 id="running-the-job-with-hbase">Running the job with HBase</h3>
-<p>Since HBaseStore is already defined as the default data store at gora.properties
+<p>Since HBaseStore is already defined as the default data store at <code>gora.properties</code>
 we can run the job with HBase as:</p>
 <div class="codehilite"><pre>$ <span class="n">bin</span><span
class="o">/</span><span class="n">gora</span> <span class="n">loganalytics</span>
 </pre></div>
 
 
 <p>The outputs of the job will be saved in the Metrics table, whose layout is defined at 
-gora-hbase-mapping.xml file. To see the results:</p>
+<code>gora-hbase-mapping.xml</code> file. To see the results:</p>
 <div class="codehilite"><pre><span class="n">hbase</span><span
class="p">(</span><span class="n">main</span><span class="p">):</span>010<span
class="p">:</span>0<span class="o">&gt;</span> <span class="n">scan</span>
<span class="s">&#39;Metrics&#39;</span><span class="p">,</span>
<span class="p">{</span><span class="n">LIMIT</span><span class="p">=</span><span
class="o">&gt;</span>1<span class="p">}</span>
 
 <span class="n">ROW</span>                          <span class="n">COLUMN</span><span
class="o">+</span><span class="n">CELL</span>
@@ -1116,14 +1121,15 @@ gora-hbase-mapping.xml file. To see the 
 <p>Other than this tutorial, there are several places that you can find 
 examples of Gora in action.</p>
 <p>The first place to look at is the examples directories 
-under various Gora modules. All the modules have a &lt;gora-module&gt;/src/examples/ directory 
+under various Gora modules. All the modules have a <code>/src/examples/</code> directory 
 under which some example classes can be found. Especially, there are some classes that are used for tests under 
-&lt;gora-core&gt;/src/examples/</p>
+<code>gora-core/src/examples/</code></p>
 <p>Second, various unit tests of Gora modules can be referred to see the API in use. The unit tests can be found 
-at <gora-module>/src/test/</p>
+at <code>gora-core/src/test/</code>. </p>
 <p>The source code for the projects using Gora can also be checked out as a reference. <a href="http://nutch.apache.org">Apache Nutch</a> is 
-one of the first class users of Gora; so looking into how Nutch uses Gora is always a good idea.</p>
-<p>Please feel free to grab our <a href="http://gora.apache.org/images/powered-by-gora.png">poweredBy</a> sticker and embedded it in anything backed by Apache Gora.</p>
+one of the first class users of Gora; so looking into how Nutch uses Gora is always a good idea. Gora is however also in use 
+in other Apache projects such as <a href="http://giraph.apache.org">Apache Giraph</a></p>
+<p>Please feel free to grab our <a href="http://gora.apache.org/resources/img/powered-by-gora.png">poweredBy</a> sticker and embedded it in anything backed by Apache Gora.</p>
 <h2 id="feedback">Feedback</h2>
 <p>At last, thanks for trying out Gora. If you find any bugs or you have suggestions for improvement, 
 do not hesitate to give feedback on the dev@gora.apache.org <a href="../mailing_lists.html">mailing list</a>.</p>


