mahout-commits mailing list archives

From build...@apache.org
Subject svn commit: r946372 - in /websites/staging/mahout/trunk/content: ./ users/classification/mlp.html
Date Sun, 05 Apr 2015 04:49:46 GMT
Author: buildbot
Date: Sun Apr  5 04:49:46 2015
New Revision: 946372

Log:
Staging update by buildbot for mahout

Added:
    websites/staging/mahout/trunk/content/users/classification/mlp.html
Modified:
    websites/staging/mahout/trunk/content/   (props changed)

Propchange: websites/staging/mahout/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Sun Apr  5 04:49:46 2015
@@ -1 +1 @@
-1671360
+1671372

Added: websites/staging/mahout/trunk/content/users/classification/mlp.html
==============================================================================
--- websites/staging/mahout/trunk/content/users/classification/mlp.html (added)
+++ websites/staging/mahout/trunk/content/users/classification/mlp.html Sun Apr  5 04:49:46
2015
@@ -0,0 +1,483 @@
+<!DOCTYPE html>
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to You under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head><meta
http-equiv="Content-Type" content="text/html; charset=UTF-8">
+  <title>Apache Mahout: Scalable machine learning and data mining</title>
+  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+  <meta name="Distribution" content="Global">
+  <meta name="Robots" content="index,follow">
+  <meta name="keywords" content="apache, apache hadoop, apache lucene,
+        business data mining, cluster analysis,
+        collaborative filtering, data extraction, data filtering, data framework, data integration,
+        data matching, data mining, data mining algorithms, data mining analysis, data mining
data,
+        data mining introduction, data mining software,
+        data mining techniques, data representation, data set, datamining,
+        feature extraction, fuzzy k means, genetic algorithm, hadoop,
+        hierarchical clustering, high dimensional, introduction to data mining, kmeans,
+        knowledge discovery, learning approach, learning approaches, learning methods,
+        learning techniques, lucene, machine learning, machine translation, mahout apache,
+        mahout taste, map reduce hadoop, mining data, mining methods, naive bayes,
+        natural language processing,
+        supervised, text mining, time series data, unsupervised, web data mining">
+  <link rel="shortcut icon" type="image/x-icon" href="http://mahout.apache.org/images/favicon.ico">
+  <script type="text/javascript" src="/js/prototype.js"></script>
+  <script type="text/javascript" src="/js/effects.js"></script>
+  <script type="text/javascript" src="/js/search.js"></script>
+  <script type="text/javascript" src="/js/slides.js"></script>
+
+  <link href="/css/bootstrap.min.css" rel="stylesheet" media="screen">
+  <link href="/css/bootstrap-responsive.css" rel="stylesheet">
+  <link rel="stylesheet" href="/css/global.css" type="text/css">
+
+  <!-- mathJax stuff -- use `\(...\)` for inline style math in markdown -->
+  <script type="text/x-mathjax-config">
+  MathJax.Hub.Config({
+    tex2jax: {
+      skipTags: ['script', 'noscript', 'style', 'textarea', 'pre']
+    }
+  });
+  MathJax.Hub.Queue(function() {
+    var all = MathJax.Hub.getAllJax(), i;
+    for(i = 0; i < all.length; i += 1) {
+      all[i].SourceElement().parentNode.className += ' has-jax';
+    }
+  });
+  </script>
+  <script type="text/javascript">
+    var mathjax = document.createElement('script'); 
+    mathjax.type = 'text/javascript'; 
+    mathjax.async = true;
+
+    mathjax.src = ('https:' == document.location.protocol) ?
+        'https://c328740.ssl.cf1.rackcdn.com/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' :
+        'http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML';
+	
+	  var s = document.getElementsByTagName('script')[0]; 
+    s.parentNode.insertBefore(mathjax, s);
+  </script>
+</head>
+
+<body id="home" data-twttr-rendered="true">
+  <div id="wrap">
+   <div id="header">
+    <div id="logo"><a href="/overview.html"></a></div>
+  <div id="search">
+    <form id="search-form" action="http://www.google.com/search" method="get" class="navbar-search
pull-right">    
+      <input value="http://mahout.apache.org" name="sitesearch" type="hidden">
+      <input class="search-query" name="q" id="query" type="text">
+      <input id="submission" type="image" src="/images/mahout-lupe.png" alt="Search" />
+    </form>
+  </div>
+
+    <div class="navbar navbar-inverse" style="position:absolute;top:133px;padding-right:0px;padding-left:0px;">
+      <div class="navbar-inner" style="border: none; background: #999; border: none; border-radius:
0px;">
+        <div class="container">
+          <button type="button" class="btn btn-navbar" data-toggle="collapse" data-target=".nav-collapse">
+            <span class="icon-bar"></span>
+            <span class="icon-bar"></span>
+            <span class="icon-bar"></span>
+          </button>
+          <!-- <a class="brand" href="#">Apache Community Development Project</a>
-->
+          <div class="nav-collapse collapse">
+            <ul class="nav">
+             <!-- <li><a href="/">Home</a></li> --> 
+              <li class="dropdown"> <a href="#" class="dropdown-toggle" data-toggle="dropdown">General<b
class="caret"></b></a>
+                <ul class="dropdown-menu">
+                  <li><a href="/general/downloads.html">Downloads</a>
+                  <li><a href="/general/who-we-are.html">Who we are</a>
+                  <li><a href="/general/mailing-lists,-irc-and-archives.html">Mailing
Lists</a>
+                  <li><a href="/general/release-notes.html">Release Notes</a>

+                  <li><a href="/general/books-tutorials-and-talks.html">Books,
Tutorials, Talks</a></li>
+                  <li><a href="/general/powered-by-mahout.html">Powered By Mahout</a>
+                  <li><a href="/general/professional-support.html">Professional
Support</a>
+                  <li class="divider"></li>
+                  <li class="nav-header">Resources</li>
+                  <li><a href="/general/reference-reading.html">Reference Reading</a>
+                  <li><a href="/general/faq.html">FAQ</a>
+                  <li class="divider"></li>
+                  <li class="nav-header">Legal</li>
+                  <li><a href="http://www.apache.org/licenses/">License</a></li>
+                  <li><a href="http://www.apache.org/security/">Security</a></li>
+                  <li><a href="/general/privacy-policy.html">Privacy Policy</a>
+                </ul>
+              </li>
+              <li class="dropdown"> <a href="#" class="dropdown-toggle" data-toggle="dropdown">Developers<b
class="caret"></b></a>
+                <ul class="dropdown-menu">
+                  <li><a href="/developers/developer-resources.html">Developer
resources</a></li>
+                  <li><a href="/developers/version-control.html">Version control</a></li>
+                  <li><a href="/developers/buildingmahout.html">Build from source</a></li>
+                  <li><a href="/developers/issue-tracker.html">Issue tracker</a></li>
+                  <li><a href="https://builds.apache.org/job/Mahout-Quality/" target="_blank">Code
quality reports</a></li>
+                  <li class="divider"></li>
+                  <li class="nav-header">Contributions</li>
+                  <li><a href="/developers/how-to-contribute.html">How to contribute</a></li>
+                  <li><a href="/developers/how-to-become-a-committer.html">How
to become a committer</a></li>
+                  <li><a href="/developers/gsoc.html">GSoC</a></li>
+                  <li class="divider"></li>
+                  <li class="nav-header">For committers</li>
+                  <li><a href="/developers/how-to-update-the-website.html">How
to update the website</a></li>
+                  <li><a href="/developers/patch-check-list.html">Patch check
list</a></li>
+                  <li><a href="/developers/github.html">Handling Github PRs</a></li>
+                  <li><a href="/developers/how-to-release.html">How to release</a></li>
+                  <li><a href="/developers/thirdparty-dependencies.html">Third
party dependencies</a></li>
+                </ul>
+               </li>
+               <li class="dropdown"> <a href="#" class="dropdown-toggle" data-toggle="dropdown">Mahout
Environment<b class="caret"></b></a>
+                <ul class="dropdown-menu">
+                  <li><a href="/users/sparkbindings/home.html">Scala &amp;
Spark Bindings Overview</a></li>
+                  <li><a href="/users/sparkbindings/faq.html">FAQ</a></li>
+                  <li class="nav-header">Engines</li>
+                  <li><a href="/users/sparkbindings/home.html">Spark</a></li>
+                  <li><a href="/users/environment/h2o-internals.html">H2O</a></li>
+                  <li class="nav-header">Tutorials</li>
+                  <li><a href="/users/sparkbindings/play-with-shell.html">Playing
with Mahout's Spark Shell</a></li>
+                </ul>
+              </li>
+              <li class="dropdown"> <a href="#" class="dropdown-toggle" data-toggle="dropdown">Algorithms<b
class="caret"></b></a>
+                <ul class="dropdown-menu">
+                  <li><a href="/users/basics/algorithms.html">List of algorithms</a>
+                  <li class="nav-header">Distributed Matrix Decomposition</li>
+                  <li><a href="/users/algorithms/d-qr.html">Cholesky QR</a></li>
+                  <li><a href="/users/algorithms/d-ssvd.html">SSVD</a></li>
+                  <li class="nav-header">Recommendations</li>
+                  <li><a href="/users/algorithms/recommender-overview.html">Recommender
Overview</a></li>
+                  <li><a href="/users/algorithms/intro-cooccurrence-spark.html">Intro
to cooccurrence-based<br/> recommendations with Spark</a></li>
+                  <li class="nav-header">Classification</li>
+                  <li><a href="/users/algorithms/spark-naive-bayes.html">Spark
Naive Bayes</a></li>
+                </ul>
+               </li>
+               <li class="dropdown"> <a href="#" class="dropdown-toggle" data-toggle="dropdown">MapReduce
Basics<b class="caret"></b></a>
+                 <ul class="dropdown-menu">
+                  <li><a href="/users/basics/algorithms.html">List of algorithms</a>
+                  <li><a href="/users/basics/quickstart.html">Overview</a>
+                  <li class="divider"></li>
+                  <li class="nav-header">Working with text</li>
+                  <li><a href="/users/basics/creating-vectors-from-text.html">Creating
vectors from text</a>
+                  <li><a href="/users/basics/collocations.html">Collocations</a>
+                  <li class="divider"></li>
+                  <li class="nav-header">Dimensionality reduction</li>
+                  <li><a href="/users/dim-reduction/dimensional-reduction.html">Singular
Value Decomposition</a></li>
+                  <li><a href="/users/dim-reduction/ssvd.html">Stochastic SVD</a></li>
+                  <li class="divider"></li>
+                  <li class="nav-header">Topic Models</li>      
+                  <li><a href="/users/clustering/latent-dirichlet-allocation.html">Latent
Dirichlet Allocation</a></li>
+                </ul>
+               </li>
+               <li class="dropdown"> <a href="#" class="dropdown-toggle" data-toggle="dropdown">Mahout
MapReduce<b class="caret"></b></a>
+                <ul class="dropdown-menu">
+                <li class="nav-header">Classification</li>
+                  <li><a href="/users/classification/bayesian.html">Naive Bayes</a></li>
+                  <li><a href="/users/classification/hidden-markov-models.html">Hidden
Markov Models</a></li>
+                  <li><a href="/users/classification/logistic-regression.html">Logistic
Regression</a></li>
+                  <li><a href="/users/classification/partial-implementation.html">Random
Forest</a></li>
+                  <li class="nav-header">Classification Examples</li>
+                  <li><a href="/users/classification/breiman-example.html">Breiman
example</a></li>
+                  <li><a href="/users/classification/twenty-newsgroups.html">20
newsgroups example</a></li>
+                  <li><a href="/users/classification/bankmarketing-example.html">SGD
classifier bank marketing</a></li>
+                  <li class="nav-header">Clustering</li>
+                  <li><a href="/users/clustering/k-means-clustering.html">k-Means</a></li>
+                  <li><a href="/users/clustering/canopy-clustering.html">Canopy</a></li>
+                  <li><a href="/users/clustering/fuzzy-k-means.html">Fuzzy k-Means</a></li>
+                  <li><a href="/users/clustering/streaming-k-means.html">Streaming
KMeans</a></li>
+                  <li><a href="/users/clustering/spectral-clustering.html">Spectral
Clustering</a></li>
+                  <li class="nav-header">Clustering Commandline usage</li>
+                  <li><a href="/users/clustering/k-means-commandline.html">Options
for k-Means</a></li>
+                  <li><a href="/users/clustering/canopy-commandline.html">Options
for Canopy</a></li>
+                  <li><a href="/users/clustering/fuzzy-k-means-commandline.html">Options
for Fuzzy k-Means</a></li>
+                  <li class="nav-header">Clustering Examples</li>
+                  <li><a href="/users/clustering/clustering-of-synthetic-control-data.html">Synthetic
data</a></li>
+                  <li class="nav-header">Cluster Post processing</li>
+                  <li><a href="/users/clustering/cluster-dumper.html">Cluster
Dumper tool</a></li>
+                  <li><a href="/users/clustering/visualizing-sample-clusters.html">Cluster
visualisation</a></li>
+                  <li class="nav-header">Recommendations</li>
+                  <li><a href="/users/recommender/recommender-first-timer-faq.html">First
Timer FAQ</a></li>
+                  <li><a href="/users/recommender/userbased-5-minutes.html">A
user-based recommender <br/>in 5 minutes</a></li>
+		  <li><a href="/users/recommender/matrix-factorization.html">Matrix factorization-based<br/>
recommenders</a></li>
+                  <li><a href="/users/recommender/recommender-documentation.html">Overview</a></li>
+                  <li><a href="/users/recommender/intro-itembased-hadoop.html">Intro
to item-based recommendations<br/> with Hadoop</a></li>
+                  <li><a href="/users/recommender/intro-als-hadoop.html">Intro
to ALS recommendations<br/> with Hadoop</a></li>
+               </ul>
+              </li>
+              <!--  <li class="dropdown"> <a href="#" class="dropdown-toggle"
data-toggle="dropdown">Recommendations<b class="caret"></b></a>
+                <ul class="dropdown-menu">
+                
+                </ul> -->
+            </li>
+           </ul>
+          </div><!--/.nav-collapse -->
+        </div>
+      </div>
+    </div>
+
+</div>
+
+ <div id="sidebar">
+  <div id="sidebar-wrap">
+    <h2>Twitter</h2>
+	<ul class="sidemenu">
+		<li>
+<a class="twitter-timeline" href="https://twitter.com/ApacheMahout" data-widget-id="422861673444028416">Tweets
by @ApacheMahout</a>
+<script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?'http':'https';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+"://platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs");</script>
+</li>
+	</ul>
+    <h2>Apache Software Foundation</h2>
+    <ul class="sidemenu">
+      <li><a href="http://www.apache.org/foundation/how-it-works.html">How the
ASF works</a></li>
+      <li><a href="http://www.apache.org/foundation/getinvolved.html">Get Involved</a></li>
+      <li><a href="http://www.apache.org/dev/">Developer Resources</a></li>
+      <li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li>
+      <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+    </ul>
+    <h2>Related Projects</h2>
+    <ul class="sidemenu">
+      <li><a href="http://lucene.apache.org/">Lucene</a></li>
+      <li><a href="http://hadoop.apache.org/">Hadoop</a></li>
+    </ul>
+  </div>
+</div>
+
+  <div id="content-wrap" class="clearfix">
+   <div id="main">
+    <h1 id="multilayer-perceptron">Multilayer Perceptron</h1>
+<p>A multilayer perceptron is a biologically inspired feed-forward network that can
+be trained to represent a nonlinear mapping between input and output data. It
+consists of multiple layers, each containing multiple artificial neuron units, and
+can be used for classification and regression tasks in a supervised learning approach.</p>
+<h2 id="command-line-usage">Command line usage</h2>
+<p>The MLP implementation is currently located in the MapReduce-Legacy package. It
+can be used with the following commands: </p>
+<h1 id="model-training">Model training</h1>
+<div class="codehilite"><pre>$ bin/mahout org.apache.mahout.classifier.mlp.TrainMultilayerPerceptron
+</pre></div>
+
+
+<h1 id="model-usage">Model usage</h1>
+<div class="codehilite"><pre>$ bin/mahout org.apache.mahout.classifier.mlp.RunMultilayerPerceptron
+</pre></div>
+
+
+<p>To train and use the model, a number of parameters can be specified. Parameters without default values have to be specified by the user. Note that not all parameters can be used both for training and for running the model. An example of the usage is given below.</p>
+<h3 id="parameters">Parameters</h3>
+<table>
+<thead>
+<tr>
+<th align="left">Command</th>
+<th align="right">Default</th>
+<th align="left">Description</th>
+<th align="left">Type</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td align="left">--input -i</td>
+<td align="right"></td>
+<td align="left">Path to the input data (currently, only .csv-files are allowed)</td>
+<td align="left"></td>
+</tr>
+<tr>
+<td align="left">--skipHeader -sh</td>
+<td align="right">false</td>
+<td align="left">Skip first row of the input file (corresponds to the csv headers)</td>
+<td align="left"></td>
+</tr>
+<tr>
+<td align="left">--update -u</td>
+<td align="right">false</td>
+<td align="left">Whether the model should be updated incrementally with every new training
instance. If this parameter is not given, the model is trained from scratch.</td>
+<td align="left">training</td>
+</tr>
+<tr>
+<td align="left">--labels -labels</td>
+<td align="right"></td>
+<td align="left">Instance labels, separated by whitespace.</td>
+<td align="left">training</td>
+</tr>
+<tr>
+<td align="left">--model -mo</td>
+<td align="right"></td>
+<td align="left">Location where the model is stored (if the specified location holds an existing model, it will be updated through incremental learning).</td>
+<td align="left"></td>
+</tr>
+<tr>
+<td align="left">--layerSize -ls</td>
+<td align="right"></td>
+<td align="left">Number of units per layer, including input, hidden and output layers. This parameter specifies the topology of the network (see <a href="mlperceptron_structure.png" title="Architecture of a three-layer MLP">this image</a> for an example specified by <code>-ls 4 8 3</code>).</td>
+<td align="left">training</td>
+</tr>
+<tr>
+<td align="left">--squashingFunction -sf</td>
+<td align="right">Sigmoid</td>
+<td align="left">The squashing function to use for the units. Currently only the sigmoid function is available.</td>
+<td align="left">training</td>
+</tr>
+<tr>
+<td align="left">--learningRate -l</td>
+<td align="right">0.5</td>
+<td align="left">The learning rate that is used for weight updates.</td>
+<td align="left">training</td>
+</tr>
+<tr>
+<td align="left">--momemtumWeight -m</td>
+<td align="right">0.1</td>
+<td align="left">The momentum weight that is used for gradient descent. Must be in the range 0 to 1.0.</td>
+<td align="left">training</td>
+</tr>
+<tr>
+<td align="left">--regularizationWeight -r</td>
+<td align="right">0</td>
+<td align="left">Regularization value for the weight vector. Must be in the range 0 to 0.1.</td>
+<td align="left">training</td>
+</tr>
+<tr>
+<td align="left">--format -f</td>
+<td align="right">csv</td>
+<td align="left">Input file format. Currently only csv is supported.</td>
+<td align="left"></td>
+</tr>
+<tr>
+<td align="left">--columnRange -cr</td>
+<td align="right"></td>
+<td align="left">Range of the columns to use from the input file, starting with 0 (e.g. <code>-cr 0 5</code> to include only the first six columns)</td>
+<td align="left">testing</td>
+</tr>
+<tr>
+<td align="left">--output -o</td>
+<td align="right"></td>
+<td align="left">Path to store the labeled results from running the model.</td>
+<td align="left">testing</td>
+</tr>
+</tbody>
+</table>
+<h2 id="example-usage">Example usage</h2>
+<p>In this example, we will train a multilayer perceptron for classification on the iris data set. The iris flower data set contains data of three flower species, where each datapoint consists of four features.
+The dimensions of the data set are given through flower measurements (sepal length, sepal width, ...). Every sample carries a label that indicates the flower species it belongs to.</p>
+<h3 id="training">Training</h3>
+<p>To train our multilayer perceptron model from the command line, we call the following
command</p>
+<div class="codehilite"><pre>$ bin/mahout org.apache.mahout.classifier.mlp.TrainMultilayerPerceptron \
+            -i ./mrlegacy/src/test/resources/iris.csv -sh \
+            -labels setosa versicolor virginica \
+            -mo /tmp/model.model -ls 4 8 3 -l 0.2 -m 0.35 -r 0.0001
+</pre></div>
+
+
+<p>The individual parameters are explained in the following.</p>
+<ul>
+<li><code>-i ./mrlegacy/src/test/resources/iris.csv</code> use the iris
data set as input data</li>
+<li><code>-sh</code> since the file <code>iris.csv</code> contains
a header row, this row needs to be skipped </li>
+<li><code>-labels setosa versicolor virginica</code> we specify which class labels should be learned (the flower species in this case)</li>
+<li><code>-mo /tmp/model.model</code> specify where to store the model
file</li>
+<li><code>-ls 4 8 3</code> we specify the structure and depth of our layers.
The actual network structure can be seen in the figure below.</li>
+<li><code>-l 0.2</code> we set the learning rate to <code>0.2</code></li>
+<li><code>-m 0.35</code> momentum weight is set to <code>0.35</code></li>
+<li><code>-r 0.0001</code> regularization weight is set to <code>0.0001</code></li>
+</ul>
+<table>
+<thead>
+<tr>
+<th></th>
+<th></th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>The picture shows the architecture defined by the above command. The topology of the network is completely defined through the number of layers and units, because in this implementation of the MLP every unit is fully connected to the units of the next and previous layer. Bias units are added automatically.</td>
+<td><img alt="Multilayer perceptron network" src="mlperceptron_structure.png" title="Architecture
of a three-layer MLP" /></td>
+</tr>
+</tbody>
+</table>
+<h3 id="testing">Testing</h3>
+<p>To test / run the multilayer perceptron classification on the trained model, we
can use the following command</p>
+<div class="codehilite"><pre>$ bin/mahout org.apache.mahout.classifier.mlp.RunMultilayerPerceptron \
+            -i ./mrlegacy/src/test/resources/iris.csv -sh -cr 0 3 \
+            -mo /tmp/model.model -o /tmp/labelResult.txt
+</pre></div>
+
+
+<p>The individual parameters are explained in the following.</p>
+<ul>
+<li><code>-i ./mrlegacy/src/test/resources/iris.csv</code> use the iris
data set as input data</li>
+<li><code>-sh</code> since the file <code>iris.csv</code> contains
a header row, this row needs to be skipped</li>
+<li><code>-cr 0 3</code> we specify the column range of the input file</li>
+<li><code>-mo /tmp/model.model</code> specify where the model file is stored</li>
+<li><code>-o /tmp/labelResult.txt</code> specify where the labeled output
file will be stored</li>
+</ul>
+<h2 id="implementation">Implementation</h2>
+<p>The Multilayer Perceptron implementation is based on a more general Neural Network class. Command line support was added later on and provides simple usage of the MLP, as shown in the example. It is implemented to run on a single machine using stochastic gradient descent, where the weights are updated using one datapoint at a time, resulting in a weight update of the form:
+$$ \vec{w}^{(t + 1)} = \vec{w}^{(t)} - \eta \nabla E_n(\vec{w}^{(t)}) $$</p>
+<p>where \(\eta\) is the learning rate and \(E_n\) is the error for the <em>n</em>-th training instance. It is not yet possible to change the learning to more advanced methods using adaptive learning rates.</p>
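The update rule above can be sketched in plain Python. This is a minimal illustration for a single sigmoid unit trained on one datapoint with squared error; the function and variable names are our own, not Mahout's API:

```python
import math

def sgd_update(w, x, target, learning_rate=0.5):
    """One stochastic gradient descent step on a single datapoint (x, target)
    for a lone sigmoid unit with squared error E_n = 0.5 * (y - target)**2."""
    a = sum(wi * xi for wi, xi in zip(w, x))      # activation
    y = 1.0 / (1.0 + math.exp(-a))                # sigmoid output
    # dE_n/dw_i = (y - target) * y * (1 - y) * x_i  (chain rule)
    delta = (y - target) * y * (1.0 - y)
    return [wi - learning_rate * delta * xi for wi, xi in zip(w, x)]

w = [0.1, -0.2, 0.3]
w_new = sgd_update(w, x=[1.0, 0.5, -1.0], target=1.0)
```

Repeating this step over the training data, one datapoint at a time, corresponds to the online learning scheme described above.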
+<p>The number of layers and units per layer can be specified manually and determines
the whole topology with each unit being fully connected to the previous layer. A bias unit
is automatically added to the input of every layer. 
+Currently, the logistic sigmoid is used as the squashing function in every hidden and output layer. It is of the form:</p>
+<p>$$ \frac{1}{1 + \exp(-a)} $$</p>
+<p>The command line version <strong>does not perform iterations</strong>, which leads to poor results on small datasets. Another restriction is that the CLI version of the MLP only supports classification, since the labels have to be given explicitly when executing on the command line.</p>
+<p>A learned model can be stored and updated with new training instances using the <code>--update</code> flag. The output of classification results is saved as a .txt-file and consists only of the assigned labels. Apart from the command-line interface, it is possible to construct more specialized neural networks using the API and interfaces in the mrlegacy package.</p>
+<h2 id="theoretical-background">Theoretical Background</h2>
+<p>The <em>multilayer perceptron</em> was inspired by the biological structure
of the brain where multiple neurons are connected and form columns and layers. Perceptual
input enters this network through our sensory organs and is then further processed into higher
levels. 
+The term multilayer perceptron is a little misleading, since the <em>perceptron</em> is a special case of a single <em>artificial neuron</em> that can be used for simple computations <a href="http://en.wikipedia.org/wiki/Perceptron" title="The perceptron in wikipedia">[1]</a>. The difference is that the perceptron uses a discontinuous nonlinearity, while for the MLP neurons implemented in Mahout it is important to use continuous nonlinearities. This is necessary for the implemented learning algorithm, where the error is propagated back from the output layer to the input layer and the weights of the connections are changed according to their contribution to the overall error. This algorithm is called backpropagation and uses gradient descent to update the weights. To compute the gradients we need continuous nonlinearities. But let's start from the beginning!</p>
+<p>The first layer of the MLP represents the input and has no other purpose than routing the input to every connected unit in a feed-forward fashion. The following layers are called hidden layers, and the last layer serves the special purpose of determining the output. The activation of a unit <em>j</em> in a hidden layer is computed through a weighted sum of all inputs, resulting in
+$$ a_j = \sum_{i=1}^{D} w_{ji}^{(l)} x_i + w_{j0}^{(l)} $$
+This computes the activation <em>a</em> for neuron <em>j</em>, where <em>w</em> is the weight from neuron <em>i</em> to neuron <em>j</em> in layer <em>l</em>. The last term, with index <em>i = 0</em>, is called the bias and can be used as an offset, independent from the input.</p>
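Written out in code, the weighted sum for one unit looks like this (an illustrative Python sketch; the names are ours, not Mahout's):

```python
def unit_activation(weights, bias, inputs):
    """Compute a_j = sum_i w_ji * x_i + w_j0, where w_j0 is the bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias

# one hidden unit with three inputs:
# a_j = 0.2*1.0 + (-0.4)*0.5 + 0.1*2.0 + 0.3 = 0.5
a_j = unit_activation(weights=[0.2, -0.4, 0.1], bias=0.3, inputs=[1.0, 0.5, 2.0])
```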
+<p>The activation is then transformed by the aforementioned differentiable, nonlinear
<em>activation function</em> and serves as the input to the next layer. The activation
function is usually chosen from the family of sigmoidal functions such as <em>tanh</em>
or the <em>logistic sigmoid</em> <a href="http://en.wikipedia.org/wiki/Sigmoid_function"
title="Sigmoid function on wikipedia">[2]</a>. Often "sigmoid" and "logistic sigmoid"
are used synonymously. Another term for the activation function is <em>squashing function</em>,
since the s-shape of this function class <em>squashes</em> the input.</p>
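<p>As an illustration, the logistic sigmoid and its derivative (which backpropagation will need later) can be sketched as follows; the function names are ours, not Mahout API:</p>

```python
import math

def logistic(a):
    # Logistic sigmoid: squashes any real activation into (0, 1).
    return 1.0 / (1.0 + math.exp(-a))

def d_logistic(a):
    # Its derivative, sigma(a) * (1 - sigma(a)), used by backpropagation.
    s = logistic(a)
    return s * (1.0 - s)

print(logistic(0.0))   # 0.5
print(math.tanh(0.0))  # 0.0, tanh is the other common squashing function
```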
+<p>For different units or layers, different activation functions can be used to obtain
different behaviors. Especially in the output layer, the activation function can be chosen
to obtain the output value <em>y</em>, depending on the learning problem:
+$$ y_k = \sigma (a_k) $$</p>
+<p>If the learning problem is a linear regression task, sigma can be chosen to be the
identity function. In case of classification problems, the choice of the squashing functions
depends on the exact task at hand and often softmax activation functions are used. </p>
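<p>For classification, a softmax output layer can be sketched as below (plain Python, illustrative only, not the Mahout implementation); it turns the output activations into values that sum to one and can be read as class probabilities:</p>

```python
import math

def softmax(activations):
    # Subtract the maximum activation for numerical stability, then
    # exponentiate and normalise so the outputs sum to 1.
    m = max(activations)
    exps = [math.exp(a - m) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
print(probs)  # three values summing to 1, largest for the activation 3.0
```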
+<p>The equation for an MLP with three layers (one input, one hidden, and one output)
is then given by</p>
+<p>$$ y_k(\vec{x}, \vec{w}) = h \left( \sum_{j=1}^{M} w_{kj}^{(2)} h \left( \sum_{i=1}^{D}
w_{ji}^{(1)} x_i + w_{j0}^{(1)} \right) + w_{k0}^{(2)} \right) $$ </p>
+<p>where <em>h</em> indicates the respective squashing function used
in the units of a layer. <em>M</em> and <em>D</em> specify the
number of incoming connections to a unit, and we can see that the input to the first layer
(the hidden layer) is just the original input <em>x</em>, whereas the input into the
second layer (the output layer) is the transformed output of layer one. The output <em>y</em>
of unit <em>k</em> is therefore given by the above equation and depends on the
input <em>x</em> and the weight vector <em>w</em>. This shows
that the only parameter we can optimize during learning is <em>w</em>, since we
cannot do anything about the input <em>x</em>. To facilitate the following steps,
we can include the bias terms in the weight vector and correct the indices by adding
another dimension with the value 1 to the input vector. The bias is a constant that
is added to the weighted sum and serves as an offset of the nonlinear transformation.
Including it in the weight vector leads to:</p>
+<p>$$ y_k(\vec{x}, \vec{w}) = h \left( \sum_{j=0}^{M} w_{kj}^{(2)} h \left( \sum_{i=0}^{D}
w_{ji}^{(1)} x_i \right) \right) $$ </p>
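<p>The full forward pass of this three-layer network, with the bias folded into the weights as above, can be sketched as follows (illustrative names and toy weights, not the Mahout implementation):</p>

```python
import math

def logistic(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(x, W1, W2):
    # Prepend the constant bias input x_0 = 1 to each layer's input, so
    # the bias weights live in the first column of W1 and W2.
    x = [1.0] + x
    hidden = [logistic(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    hidden = [1.0] + hidden  # bias unit for the next layer
    return [logistic(sum(w * hi for w, hi in zip(row, hidden))) for row in W2]

# Two inputs, two hidden units, one output; the first weight in each row is the bias.
W1 = [[0.1, 0.4, -0.3], [0.2, -0.1, 0.5]]
W2 = [[0.0, 0.7, -0.6]]
print(forward([1.0, 2.0], W1, W2))
```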
+<p>The previous paragraphs described how the MLP transforms a given input into some
output using a combination of different nonlinear functions. Of course, what we really want
is to learn the structure of our data, so that we can feed data with unknown labels into the
network and get the estimated target labels <em>t</em>. To achieve this, we have
to train our network. In this context, training means optimizing some function such that the
error between the real labels <em>t</em> and the network output <em>y</em>
becomes as small as possible. We have seen in the previous paragraph that our only knob to change is the
weight vector <em>w</em>, making the function to be optimized a function of <em>w</em>.
For simplicity, and because it is widely used, we choose the so-called <em>sum-of-squares</em>
error function as an example, which is given by</p>
+<p>$$ E(\vec{w}) = \frac{1}{2} \sum_{n=1}^N \left( y(\vec{x}_n, \vec{w}) - t_n \right)^2
$$</p>
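<p>The sum-of-squares error translates directly from the formula into code; a minimal sketch with made-up values:</p>

```python
def sum_of_squares_error(outputs, targets):
    # E(w) = 1/2 * sum_n (y_n - t_n)^2 over all training examples.
    return 0.5 * sum((y - t) ** 2 for y, t in zip(outputs, targets))

# 0.5 * ((0.9 - 1.0)^2 + (0.2 - 0.0)^2) = 0.5 * 0.05 = 0.025
print(sum_of_squares_error([0.9, 0.2], [1.0, 0.0]))
```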
+<p>The goal is to minimize this function and thereby increase the performance of our
model. A common way to achieve this is gradient descent combined with the technique
of <em>backpropagation</em>: compute the contribution of every
unit to the overall error and change each weight according to this contribution, in
the direction of the negative gradient of the error function at this particular unit. In the following
we give a short overview of model training with gradient descent and backpropagation.
A more detailed treatment can be found in <a href="http://research.microsoft.com/en-us/um/people/cmbishop/prml/"
title="Christopher M. Bishop: Pattern Recognition and Machine Learning, Springer 2009">[3]</a>,
from which much of this information is taken.</p>
+<p>The problem with minimizing the error function is that the error can only be computed
at the output layer, where we know <em>t</em>, but we want to update all the weights
of all the units. Therefore we use backpropagation to propagate the error,
which is first computed at the output layer, back to the units of the previous layers. For this
approach we also need the derivatives of the activation functions.</p>
+<p>Weights are then updated with a small step in the direction of the negative gradient,
regulated by the learning rate <em>&eta;</em>, so that we arrive at the formula for
the weight update:</p>
+<p>$$ \vec{w}^{(t + 1)} = \vec{w}^{(t)} - \eta \nabla E(\vec{w}^{(t)}) $$</p>
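<p>A single weight update then looks like the following sketch. Computing the gradient vector is what backpropagation does; here we simply assume it is given, and all names and numbers are illustrative:</p>

```python
def gradient_step(weights, gradient, learning_rate):
    # w(t+1) = w(t) - eta * gradient of E at w(t), element by element.
    return [w - learning_rate * g for w, g in zip(weights, gradient)]

# With eta = 0.1: 0.5 - 0.1*0.1 = 0.49 and -0.2 - 0.1*(-0.4) = -0.16
print(gradient_step([0.5, -0.2], [0.1, -0.4], 0.1))
```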
+<p>A momentum weight can be set as a parameter of the gradient descent method to increase
the probability of finding better local or global optima of the error function.</p>
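<p>With a momentum term, a fraction of the previous update is carried over into the next one, which helps the search roll past shallow local minima. A sketch under our own naming, not Mahout's actual parameters:</p>

```python
def momentum_step(weights, gradient, velocity, learning_rate, momentum):
    # v(t+1) = mu * v(t) - eta * gradient;  w(t+1) = w(t) + v(t+1)
    velocity = [momentum * v - learning_rate * g
                for v, g in zip(velocity, gradient)]
    weights = [w + v for w, v in zip(weights, velocity)]
    return weights, velocity

# Velocity keeps part of the old step: 0.9*0.02 - 0.1*0.1 = 0.008
w, v = momentum_step([0.5], [0.1], [0.02], learning_rate=0.1, momentum=0.9)
print(w, v)
```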
+<h2 id="references">References</h2>
+<p>[1] http://en.wikipedia.org/wiki/Perceptron</p>
+<p>[2] http://en.wikipedia.org/wiki/Sigmoid_function</p>
+<p>[3] <a href="http://research.microsoft.com/en-us/um/people/cmbishop/prml/">Christopher
M. Bishop: Pattern Recognition and Machine Learning, Springer 2009</a></p>
+   </div>
+  </div>     
+</div> 
+  <footer class="footer" align="center">
+    <div class="container">
+      <p>
+        Copyright &copy; 2014 The Apache Software Foundation, Licensed under
+        the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version
2.0</a>.
+        <br />
+        Apache and the Apache feather logos are trademarks of The Apache Software Foundation.
+      </p>
+    </div>
+  </footer>
+  
+  <script src="/js/jquery-1.9.1.min.js"></script>
+  <script src="/js/bootstrap.min.js"></script>
+  <script>
+    (function() {
+      var cx = '012254517474945470291:vhsfv7eokdc';
+      var gcse = document.createElement('script');
+      gcse.type = 'text/javascript';
+      gcse.async = true;
+      gcse.src = (document.location.protocol == 'https:' ? 'https:' : 'http:') +
+          '//www.google.com/cse/cse.js?cx=' + cx;
+      var s = document.getElementsByTagName('script')[0];
+      s.parentNode.insertBefore(gcse, s);
+    })();
+  </script>
+</body>
+</html>