mahout-commits mailing list archives

From rawkintr...@apache.org
Subject mahout git commit: [WEBSITE] Move BuildingMahout.md
Date Wed, 29 Nov 2017 19:25:25 GMT
Repository: mahout
Updated Branches:
  refs/heads/master e59101243 -> fe77fc19f


[WEBSITE] Move BuildingMahout.md


Project: http://git-wip-us.apache.org/repos/asf/mahout/repo
Commit: http://git-wip-us.apache.org/repos/asf/mahout/commit/fe77fc19
Tree: http://git-wip-us.apache.org/repos/asf/mahout/tree/fe77fc19
Diff: http://git-wip-us.apache.org/repos/asf/mahout/diff/fe77fc19

Branch: refs/heads/master
Commit: fe77fc19fc0c4d0c05c55a30a473acc71e30f1de
Parents: e591012
Author: Trevor a.k.a @rawkintrevo <trevor.d.grant@gmail.com>
Authored: Wed Nov 29 13:25:14 2017 -0600
Committer: Trevor a.k.a @rawkintrevo <trevor.d.grant@gmail.com>
Committed: Wed Nov 29 13:25:14 2017 -0600

----------------------------------------------------------------------
 website/build_site.sh                        |   5 +
 website/oldsite/developers/buildingmahout.md | 187 ++++++++++++++++++----
 2 files changed, 164 insertions(+), 28 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/mahout/blob/fe77fc19/website/build_site.sh
----------------------------------------------------------------------
diff --git a/website/build_site.sh b/website/build_site.sh
old mode 100755
new mode 100644
index 0a66962..d8502d8
--- a/website/build_site.sh
+++ b/website/build_site.sh
@@ -19,6 +19,7 @@ export PATH=${GEM_HOME}/bin:$PATH
 (cd docs && bundle)
 (cd docs && bundle exec jekyll build --destination $WORKDIR/docs/latest)
 
+
 # Set env for docs
 MAHOUT_VERSION=0.13.0
 DISTFILE=apache-mahout-distribution-$MAHOUT_VERSION.tar.gz
@@ -37,4 +38,8 @@ rm -rf *
 cp -a $WORKDIR/* .
 git add .
 git commit -m "Automatic Site Publish by Buildbot"
+<<<<<<< HEAD
+git push origin asf-site
+=======
 git push origin asf-site
+>>>>>>> e591012439c04e98d669ef9732fde865a9ef76fa
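
The hunk above adds Git conflict markers (`<<<<<<< HEAD`, `=======`, `>>>>>>> <commit>`) to `build_site.sh` as literal lines. A check of the following kind, run before committing or at the start of a publish job, would flag such a file. This is a minimal sketch: `has_no_conflict_markers` is a hypothetical helper name, not something in the Mahout repository, and the pattern assumes Git's default marker style.

```shell
# Hypothetical helper (not part of Mahout): returns 0 when the given file
# contains no Git conflict markers, 1 when it does, 2 when it is unreadable.
has_no_conflict_markers() {
    [ -r "$1" ] || return 2
    # Match lines that start with the three marker forms Git writes by default.
    ! grep -Eq '^(<<<<<<< |=======$|>>>>>>> )' "$1"
}

# Example use from the repository root:
#   has_no_conflict_markers website/build_site.sh || echo "conflict markers found"
```

The check is line-anchored so that ordinary uses of `=` or `<` inside the script do not trigger it.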

http://git-wip-us.apache.org/repos/asf/mahout/blob/fe77fc19/website/oldsite/developers/buildingmahout.md
----------------------------------------------------------------------
diff --git a/website/oldsite/developers/buildingmahout.md b/website/oldsite/developers/buildingmahout.md
index 8e1e7f0..40b509b 100644
--- a/website/oldsite/developers/buildingmahout.md
+++ b/website/oldsite/developers/buildingmahout.md
@@ -1,16 +1,17 @@
 ---
 layout: default
-title: BuildingMahout
-theme:
-    name: retro-mahout
+title: Building Mahout
+theme: 
+    name: mahout2
 ---
 
-# Building Mahout from source
+
+# Building Mahout from Source
 
 ## Prerequisites
 
 * Java JDK 1.7
-* Apache Maven 3.3.3
+* Apache Maven 3.3.9
 
 
 ## Getting the source code
@@ -23,40 +24,170 @@ or
  
     git clone https://github.com/apache/mahout.git
 
-##Hadoop version
-Mahout code depends on hadoop-client artifact, with the default version 2.4.1. To build Mahout against to a
-different hadoop version, hadoop.version property should be set accordingly and passed to the build command.
-Hadoop1 clients would additionally require hadoop1 profile to be activated.
+## Building From Source
+
+###### Prerequisites:
+
+Linux Environment (preferably Ubuntu 16.04.x) Note: Currently only the JVM-only build will work on a Mac.
+gcc > 4.x
+NVIDIA Card (installed with OpenCL drivers alongside usual GPU drivers)
+
+###### Downloads
+
+Install java 1.7+ in an easily accessible directory (for this example,  ~/java/)
+http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
+    
+Create a directory ~/apache/ .
+    
+Download apache Maven 3.3.9 and un-tar/gunzip to ~/apache/apache-maven-3.3.9/ .
+https://maven.apache.org/download.cgi
+        
+Download and un-tar/gunzip Hadoop 2.4.1 to ~/apache/hadoop-2.4.1/ .
+https://archive.apache.org/dist/hadoop/common/hadoop-2.4.1/    
+
+Download and un-tar/gunzip spark-1.6.3-bin-hadoop2.4 to  ~/apache/ .
+http://spark.apache.org/downloads.html
+Choose release: Spark-1.6.3 (Nov 07 2016)
+Choose package type: Pre-Built for Hadoop 2.4
+
+Install ViennaCL 1.7.0+
+If running Ubuntu 16.04+
+
+```
+sudo apt-get install libviennacl-dev
+```
+
+Otherwise, if your distribution’s package manager does not have a viennacl-dev package at version 1.7.0 or later, clone it directly into the directory that will be included when Mahout is compiled:
+
+```
+mkdir -p ~/tmp
+cd ~/tmp && git clone https://github.com/viennacl/viennacl-dev.git
+cp -r viennacl-dev/viennacl/ /usr/local/
+cp -r viennacl-dev/CL/ /usr/local/
+```
+
+Ensure that the OpenCL 1.2+ drivers are installed (packed with most consumer grade NVIDIA drivers).  Not sure about higher end cards.
+
+Clone mahout repository into `~/apache`.
+
+```
+git clone https://github.com/apache/mahout.git
+```
+
+###### Configuration
+
+When building mahout for a spark backend, we need four System Environment variables set:
+```
+    export MAHOUT_HOME=/home/<user>/apache/mahout
+    export HADOOP_HOME=/home/<user>/apache/hadoop-2.4.1
+    export SPARK_HOME=/home/<user>/apache/spark-1.6.3-bin-hadoop2.4    
+    export JAVA_HOME=/home/<user>/java/jdk-1.8.121
+```
+
+Mahout on Spark regularly uses one more env variable, the IP of the Spark cluster’s master node (usually the node which one would be logged into).
+
+To use 4 local cores (Spark master need not be running)
+```
+export MASTER=local[4]
+```
+To use all available local cores (again, Spark master need not be running)
+```
+export MASTER=local[*]
+```
+To point to a cluster with spark running: 
+```
+export MASTER=spark://master.ip.address:7077
+```
+
+We then add these to the path:
+
+```
+   PATH=$PATH:$MAHOUT_HOME/bin:$HADOOP_HOME/bin:$SPARK_HOME/bin:$JAVA_HOME/bin
+```
+
+These should be added to your ~/.bashrc file.
+
+
+###### Building Mahout with Apache Maven
+
+From the  $MAHOUT_HOME directory we may issue the commands to build each using mvn profiles.
+
+JVM only:
+```
+mvn clean install -DskipTests
+```
+
+JVM with native OpenMP level 2 and level 3 matrix/vector Multiplication
+```
+mvn clean install -Pviennacl-omp -Phadoop2 -DskipTests
+```
+JVM with native OpenMP and OpenCL for Level 2 and level 3 matrix/vector Multiplication. (GPU errors fall back to OpenMP; currently only a single GPU/node is supported).
+```
+mvn clean install -Pviennacl -Phadoop2 -DskipTests
+```
+
+### Changing Scala Version
+
+To change the Scala version used it is possible to use profiles, however the resulting artifacts seem to have trouble being resolved with SBT.
+
+```bash
+mvn clean install -Pscala-2.11
+```
+
+Maven is able to resolve the resulting artifacts effectively, this will also work if the goal is simply to use the Mahout-Shell. However if the goal is to build with SBT, the following tool should be used
+
+```bash
+cd $MAHOUT_HOME/buildtools
+./change-scala-version.sh 2.11
+```
+
+Now go back to `$MAHOUT_HOME` and execute
+
+```bash
+mvn clean install -Pscala-2.11
+```
+
+**NOTE:** you still need to pass the `-Pscala-2.11` profile, as this determines and propagates the minor scala version (e.g. 2.11.8)
+
+
+### The Distribution Profile
 
-The build lifecycle is illustrated below. 
+The distribution profile, among other things, will produce the same artifact for multiple Scala and Spark versions.
 
-## Compiling
+Specifically, in addition to creating all of the
 
-Compile Mahout using standard maven commands
+Default Targets:
+- Spark 1.6 Bindings, Scala-2.10
+- Mahout-Math Scala-2.10
+- ViennaCL Scala-2.10*
+- ViennaCL-OMP Scala-2.10*
+- H2O Scala-2.10
 
-    # With hadoop-2.4.1 dependency
-    mvn clean compile
+It will also create:
+- Spark 2.0 Bindings, Scala-2.11
+- Spark 2.1 Bindings, Scala-2.11
+- Mahout-Math Scala-2.11
+- ViennaCL Scala-2.11*
+- ViennaCL-OMP Scala-2.11*
+- H2O Scala-2.11
 
-    # With hadoop-1.2.1 dependency
-    mvn -Phadoop1 -Dhadoop.version=1.2.1 clean compile
+Note: * ViennaCLs are only created if the `viennacl` or `viennacl-omp` profiles are activated.
 
-##Packaging
+By default, this phase will execute the `package` lifecycle goal on all built "extra" variants.
 
-Mahout has an extensive test suite which takes some time to run. If you just want to build Mahout, skip the tests like this
+E.g. if you were to run
 
-    # With hadoop-2.4.1 dependency
-    mvn -DskipTests=true clean package
+```bash
+mvn clean install -Pdistribution
+```
 
-    # With hadoop-1.2.1 dependency
-    mvn -Phadoop1 -Dhadoop.version=1.2.1 -DskipTests=true clean package
+You will `install` all of the "Default Targets" but only `package` the "Also created".
 
+If you wish to `install` all of the above, you can set the `lifecycle.target` switch as follows:
 
-In order to add mahout artifact to your local repository, run
+```bash
+mvn clean install -Pdistribution -Dlifecycle.target=install
+```
 
-    # With hadoop-2.4.1 dependency
-    mvn clean install
 
-    # With hadoop-1.2.1 dependency
-    mvn -Phadoop1 -Dhadoop.version=1.2.1 clean install
 
- 
\ No newline at end of file
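
The configuration steps in the patched `buildingmahout.md` could be collected into a single `~/.bashrc` fragment. The following is a minimal sketch, assuming the directory layout the document describes (`~/apache/...` and `~/java/...`). The JDK directory name is copied from the document and will differ per install, and `local[*]` is only one of the three `MASTER` options the document lists.

```shell
# Sketch of a ~/.bashrc fragment; the paths assume the layout from the build guide.
export MAHOUT_HOME="$HOME/apache/mahout"
export HADOOP_HOME="$HOME/apache/hadoop-2.4.1"
export SPARK_HOME="$HOME/apache/spark-1.6.3-bin-hadoop2.4"
# Assumption: replace with the directory where the JDK was actually installed.
export JAVA_HOME="$HOME/java/jdk-1.8.121"

# Use all local cores; swap in spark://master.ip.address:7077 for a running cluster.
export MASTER="local[*]"

# Each variable needs its own leading "$" so the shell expands it.
export PATH="$PATH:$MAHOUT_HOME/bin:$HADOOP_HOME/bin:$SPARK_HOME/bin:$JAVA_HOME/bin"
```

Quoting the assignments keeps the fragment working when `$HOME` contains spaces, and quoting `local[*]` prevents the shell from treating it as a glob pattern.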

