mahout-commits mailing list archives

From rawkintr...@apache.org
Subject [09/18] mahout git commit: Refactored to docs and front sub sites
Date Thu, 27 Apr 2017 16:47:07 GMT
http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/docs/tutorials/classify-a-doc-from-the-shell.md
----------------------------------------------------------------------
diff --git a/website/docs/tutorials/classify-a-doc-from-the-shell.md b/website/docs/tutorials/classify-a-doc-from-the-shell.md
new file mode 100644
index 0000000..0a237d1
--- /dev/null
+++ b/website/docs/tutorials/classify-a-doc-from-the-shell.md
@@ -0,0 +1,258 @@
+---
+layout: page
+title: Text Classification Example
+theme:
+    name: mahout2
+---
+
+# Building a text classifier in Mahout's Spark Shell
+
+This tutorial will take you through the steps used to train a Multinomial Naive Bayes model and create a text classifier based on that model using the ```mahout spark-shell```. 
+
+## Prerequisites
+This tutorial assumes that you have your Spark environment variables set for the ```mahout spark-shell```; see [Playing with Mahout's Shell](http://mahout.apache.org/users/sparkbindings/play-with-shell.html).  We also assume that Mahout is running in cluster mode (i.e. with the ```MAHOUT_LOCAL``` environment variable **unset**), as we'll be reading and writing to HDFS.
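+
+Concretely, that means your shell profile contains something like the following (the paths are placeholders; the variables are the same ones used throughout these setup docs):
+
+    export MAHOUT_HOME=/path/to/mahout
+    export SPARK_HOME=/path/to/spark
+    export MASTER=[url of the Spark master]
+    unset MAHOUT_LOCAL    # cluster mode, so we can read and write HDFS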
+
+## Downloading and Vectorizing the Wikipedia dataset
+*As of Mahout v. 0.10.0, we are still reliant on the MapReduce versions of ```mahout seqwiki``` and ```mahout seq2sparse``` to extract and vectorize our text.  A* [*Spark implementation of seq2sparse*](https://issues.apache.org/jira/browse/MAHOUT-1663) *is in the works for Mahout v. 0.11.* However, to download the Wikipedia dataset, extract the bodies of the documents, label each document and vectorize the text into TF-IDF vectors, we can simply run the [wikipedia-classifier.sh](https://github.com/apache/mahout/blob/master/examples/bin/classify-wikipedia.sh) example.
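+
+Run the script from your Mahout installation (the location follows from the link above) and it will present a menu:
+
+    $MAHOUT_HOME/examples/bin/classify-wikipedia.sh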
+
+    Please select a number to choose the corresponding task to run
+    1. CBayes (may require increased heap space on yarn)
+    2. BinaryCBayes
+    3. clean -- cleans up the work area in /tmp/mahout-work-wiki
+    Enter your choice :
+
+Enter (2). This will download a large recent XML dump of the Wikipedia database into a ```/tmp/mahout-work-wiki``` directory, unzip it and place it into HDFS.  It will run a [MapReduce job to parse the wikipedia set](http://mahout.apache.org/users/classification/wikipedia-classifier-example.html), extracting and labeling only pages with category tags for [United States] and [United Kingdom] (~11600 documents). It will then run ```mahout seq2sparse``` to convert the documents into TF-IDF vectors.  The script will also build and test a [Naive Bayes model using MapReduce](http://mahout.apache.org/users/classification/bayesian.html).  When it is completed, you should see a confusion matrix on your screen.  For this tutorial, we will ignore the MapReduce model and build a new model using Spark, based on the vectorized text output by ```seq2sparse```.
+
+## Getting Started
+
+Launch the ```mahout spark-shell```.  There is an example script, ```spark-document-classifier.mscala``` (.mscala denotes a Mahout-Scala script, which can be run much like an R script).  We will walk through this script in this tutorial, but if you want to simply run it, you can issue the command:
+
+    mahout> :load /path/to/mahout/examples/bin/spark-document-classifier.mscala
+
+For now, let's take the script apart piece by piece.  You can cut and paste the following code blocks into the ```mahout spark-shell```.
+
+## Imports
+
+Our Mahout Naive Bayes imports:
+
+    import org.apache.mahout.classifier.naivebayes._
+    import org.apache.mahout.classifier.stats._
+    import org.apache.mahout.nlp.tfidf._
+
+Hadoop imports needed to read our dictionary:
+
+    import org.apache.hadoop.io.Text
+    import org.apache.hadoop.io.IntWritable
+    import org.apache.hadoop.io.LongWritable
+
+## Read in our full set from HDFS as vectorized by seq2sparse in classify-wikipedia.sh
+
+    val pathToData = "/tmp/mahout-work-wiki/"
+    val fullData = drmDfsRead(pathToData + "wikipediaVecs/tfidf-vectors")
+
+## Extract the category of each observation and aggregate those observations by category
+
+    val (labelIndex, aggregatedObservations) = SparkNaiveBayes.extractLabelsAndAggregateObservations(
+                                                                 fullData)
+
+## Build a Multinomial Naive Bayes model and self test on the training set
+
+    val model = SparkNaiveBayes.train(aggregatedObservations, labelIndex, false)
+    val resAnalyzer = SparkNaiveBayes.test(model, fullData, false)
+    println(resAnalyzer)
+    
+Printing the ```ResultAnalyzer``` will display the confusion matrix.
+
+## Read in the dictionary and document frequency count from HDFS
+    
+    val dictionary = sdc.sequenceFile(pathToData + "wikipediaVecs/dictionary.file-0",
+                                      classOf[Text],
+                                      classOf[IntWritable])
+    val documentFrequencyCount = sdc.sequenceFile(pathToData + "wikipediaVecs/df-count",
+                                                  classOf[IntWritable],
+                                                  classOf[LongWritable])
+
+    // setup the dictionary and document frequency count as maps
+    val dictionaryRDD = dictionary.map { 
+                                    case (wKey, wVal) => wKey.asInstanceOf[Text]
+                                                             .toString() -> wVal.get() 
+                                       }
+                                       
+    val documentFrequencyCountRDD = documentFrequencyCount.map {
+                                            case (wKey, wVal) => wKey.asInstanceOf[IntWritable]
+                                                                     .get() -> wVal.get() 
+                                                               }
+    
+    val dictionaryMap = dictionaryRDD.collect.map(x => x._1.toString -> x._2.toInt).toMap
+    val dfCountMap = documentFrequencyCountRDD.collect.map(x => x._1.toInt -> x._2.toLong).toMap
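+
+As a quick sanity check, you can look up a term's dictionary index and its document frequency (a sketch; "football" is an arbitrary term that may or may not be in your dictionary):
+
+    dictionaryMap.get("football") match {
+        case Some(idx) => println(s"football -> index $idx, df = ${dfCountMap(idx)}")
+        case None      => println("term not in dictionary")
+    }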
+
+## Define a function to tokenize and vectorize new text using our current dictionary
+
+For this simple example, our function ```vectorizeDocument(...)``` will tokenize a new document into unigrams using native Java String methods and vectorize it using our dictionary and document frequencies. You could also use a [Lucene](https://lucene.apache.org/core/) analyzer for bigrams, trigrams, etc., and integrate Apache [Tika](https://tika.apache.org/) to extract text from different document types (PDF, PPT, XLS, etc.).  Here, however, we will keep it simple, stripping and tokenizing our text using regexes and native String methods.
+
+    def vectorizeDocument(document: String,
+                            dictionaryMap: Map[String,Int],
+                            dfMap: Map[Int,Long]): Vector = {
+        val wordCounts = document.replaceAll("[^\\p{L}\\p{Nd}]+", " ")
+                                    .toLowerCase
+                                    .split(" ")
+                                    .groupBy(identity)
+                                    .mapValues(_.length)         
+        val vec = new RandomAccessSparseVector(dictionaryMap.size)
+        val totalDFSize = dfMap(-1)
+        val docSize = wordCounts.size
+        for (word <- wordCounts) {
+            val term = word._1
+            if (dictionaryMap.contains(term)) {
+                val tfidf: TermWeight = new TFIDF()
+                val termFreq = word._2
+                val dictIndex = dictionaryMap(term)
+                val docFreq = dfMap(dictIndex)
+                val currentTfIdf = tfidf.calculate(termFreq,
+                                                   docFreq.toInt,
+                                                   docSize,
+                                                   totalDFSize.toInt)
+                vec.setQuick(dictIndex, currentTfIdf)
+            }
+        }
+        vec
+    }
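+
+For example, vectorizing a short string (unknown terms are simply skipped, so any text will do):
+
+    val testVec = vectorizeDocument("football is a popular sport", dictionaryMap, dfCountMap)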
+
+## Setup our classifier
+
+    val labelMap = model.labelIndex
+    val numLabels = model.numLabels
+    val reverseLabelMap = labelMap.map(x => x._2 -> x._1)
+    
+    // instantiate the correct type of classifier
+    val classifier = model.isComplementary match {
+        case true => new ComplementaryNBClassifier(model)
+        case _ => new StandardNBClassifier(model)
+    }
+
+## Define an argmax function 
+
+The label with the highest score wins the classification for a given document.
+    
+    def argmax(v: Vector): (Int, Double) = {
+        var bestIdx: Int = Integer.MIN_VALUE
+        var bestScore: Double = Integer.MIN_VALUE.asInstanceOf[Int].toDouble
+        for(i <- 0 until v.size) {
+            if(v(i) > bestScore){
+                bestScore = v(i)
+                bestIdx = i
+            }
+        }
+        (bestIdx, bestScore)
+    }
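+
+A quick check (```dvec```, Mahout's in-memory dense vector constructor, should be available in the shell):
+
+    argmax(dvec(0.2, 0.9, 0.4))  // returns (1, 0.9)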
+
+## Define our TF(-IDF) vector classifier
+
+    def classifyDocument(clvec: Vector) : String = {
+        val cvec = classifier.classifyFull(clvec)
+        val (bestIdx, bestScore) = argmax(cvec)
+        reverseLabelMap(bestIdx)
+    }
+
+## Two sample news articles: United States Football and United Kingdom Football
+    
+    // A random United States football article
+    // http://www.reuters.com/article/2015/01/28/us-nfl-superbowl-security-idUSKBN0L12JR20150128
+    val UStextToClassify = new String("(Reuters) - Super Bowl security officials acknowledge" +
+        " the NFL championship game represents a high profile target on a world stage but are" +
+        " unaware of any specific credible threats against Sunday's showcase. In advance of" +
+        " one of the world's biggest single day sporting events, Homeland Security Secretary" +
+        " Jeh Johnson was in Glendale on Wednesday to review security preparations and tour" +
+        " University of Phoenix Stadium where the Seattle Seahawks and New England Patriots" +
+        " will battle. Deadly shootings in Paris and arrest of suspects in Belgium, Greece and" +
+        " Germany heightened fears of more attacks around the world and social media accounts" +
+        " linked to Middle East militant groups have carried a number of threats to attack" +
+        " high-profile U.S. events. There is no specific credible threat, said Johnson, who" + 
+        " has appointed a federal coordination team to work with local, state and federal" +
+        " agencies to ensure safety of fans, players and other workers associated with the" + 
+        " Super Bowl. I'm confident we will have a safe and secure and successful event." +
+        " Sunday's game has been given a Special Event Assessment Rating (SEAR) 1 rating, the" +
+        " same as in previous years, except for the year after the Sept. 11, 2001 attacks, when" +
+        " a higher level was declared. But security will be tight and visible around Super" +
+        " Bowl-related events as well as during the game itself. All fans will pass through" +
+        " metal detectors and pat downs. Over 4,000 private security personnel will be deployed" +
+        " and the almost 3,000 member Phoenix police force will be on Super Bowl duty. Nuclear" +
+        " device sniffing teams will be deployed and a network of Bio-Watch detectors will be" +
+        " set up to provide a warning in the event of a biological attack. The Department of" +
+        " Homeland Security (DHS) said in a press release it had held special cyber-security" +
+        " and anti-sniper training sessions. A U.S. official said the Transportation Security" +
+        " Administration, which is responsible for screening airline passengers, will add" +
+        " screeners and checkpoint lanes at airports. Federal air marshals, behavior detection" +
+        " officers and dog teams will help to secure transportation systems in the area. We" +
+        " will be ramping it (security) up on Sunday, there is no doubt about that, said Federal"+
+        " Coordinator Matthew Allen, the DHS point of contact for planning and support. I have" +
+        " every confidence the public safety agencies that represented in the planning process" +
+        " are going to have their best and brightest out there this weekend and we will have" +
+        " a very safe Super Bowl.")
+    
+    // A random United Kingdom football article
+    // http://www.reuters.com/article/2015/01/26/manchester-united-swissquote-idUSL6N0V52RZ20150126
+    val UKtextToClassify = new String("(Reuters) - Manchester United have signed a sponsorship" +
+        " deal with online financial trading company Swissquote, expanding the commercial" +
+        " partnerships that have helped to make the English club one of the richest teams in" +
+        " world soccer. United did not give a value for the deal, the club's first in the sector," +
+        " but said on Monday it was a multi-year agreement. The Premier League club, 20 times" +
+        " English champions, claim to have 659 million followers around the globe, making the" +
+        " United name attractive to major brands like Chevrolet cars and sportswear group Adidas." +
+        " Swissquote said the global deal would allow it to use United's popularity in Asia to" +
+        " help it meet its targets for expansion in China. Among benefits from the deal," +
+        " Swissquote's clients will have a chance to meet United players and get behind the scenes" +
+        " at the Old Trafford stadium. Swissquote is a Geneva-based online trading company that" +
+        " allows retail investors to buy and sell foreign exchange, equities, bonds and other asset" +
+        " classes. Like other retail FX brokers, Swissquote was left nursing losses on the Swiss" +
+        " franc after Switzerland's central bank stunned markets this month by abandoning its cap" +
+        " on the currency. The fallout from the abrupt move put rival and West Ham United shirt" +
+        " sponsor Alpari UK into administration. Swissquote itself was forced to book a 25 million" +
+        " Swiss francs ($28 million) provision for its clients who were left out of pocket" +
+        " following the franc's surge. United's ability to grow revenues off the pitch has made" +
+        " them the second richest club in the world behind Spain's Real Madrid, despite a" +
+        " downturn in their playing fortunes. United Managing Director Richard Arnold said" +
+        " there was still lots of scope for United to develop sponsorships in other areas of" +
+        " business. The last quoted statistics that we had showed that of the top 25 sponsorship" +
+        " categories, we were only active in 15 of those, Arnold told Reuters. I think there is a" +
+        " huge potential still for the club, and the other thing we have seen is there is very" +
+        " significant growth even within categories. United have endured a tricky transition" +
+        " following the retirement of manager Alex Ferguson in 2013, finishing seventh in the" +
+        " Premier League last season and missing out on a place in the lucrative Champions League." +
+        " ($1 = 0.8910 Swiss francs) (Writing by Neil Maidment, additional reporting by Jemima" + 
+        " Kelly; editing by Keith Weir)")
+
+## Vectorize and classify our documents
+
+    val usVec = vectorizeDocument(UStextToClassify, dictionaryMap, dfCountMap)
+    val ukVec = vectorizeDocument(UKtextToClassify, dictionaryMap, dfCountMap)
+    
+    println("Classifying the news article about superbowl security (united states)")
+    classifyDocument(usVec)
+    
+    println("Classifying the news article about Manchester United (united kingdom)")
+    classifyDocument(ukVec)
+
+## Tie everything together in a new method to classify text 
+    
+    def classifyText(txt: String): String = {
+        val v = vectorizeDocument(txt, dictionaryMap, dfCountMap)
+        classifyDocument(v)
+    }
+
+## Now we can simply call our classifyText(...) method on any String
+
+    classifyText("Hello world from Queens")
+    classifyText("Hello world from London")
+    
+## Model persistence
+
+You can save the model to HDFS:
+
+    model.dfsWrite("/path/to/model")
+    
+And retrieve it with:
+
+    val model = NBModel.dfsRead("/path/to/model")
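+
+If you reload the model in a new session, rebuild the classifier around it before classifying, reusing the pattern from the setup section above:
+
+    val classifier = model.isComplementary match {
+        case true => new ComplementaryNBClassifier(model)
+        case _ => new StandardNBClassifier(model)
+    }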
+
+The trained model can now be embedded in an external application.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/docs/tutorials/how-to-build-an-app.md
----------------------------------------------------------------------
diff --git a/website/docs/tutorials/how-to-build-an-app.md b/website/docs/tutorials/how-to-build-an-app.md
new file mode 100644
index 0000000..0ad232e
--- /dev/null
+++ b/website/docs/tutorials/how-to-build-an-app.md
@@ -0,0 +1,256 @@
+---
+layout: page
+title: How to Build an App Using Mahout
+theme:
+    name: mahout2
+---
+# How to create an App using Mahout
+
+This is an example of how to create a simple app using Mahout as a library. The source is available on GitHub in the [3-input-cooc project](https://github.com/pferrel/3-input-cooc), with more explanation about what it does (it relates to collaborative filtering). For this tutorial we'll concentrate on the app rather than the data science.
+
+The app reads in three user-item interaction types and creates indicators for them using cooccurrence and cross-cooccurrence. The indicators will be written to text files in a format ready for indexing in a search-engine-based recommender.
+
+## Setup
+In order to build and run the CooccurrenceDriver you need to install the following:
+
+* Install the Java 7 JDK from Oracle. Mac users look here: [Java SE Development Kit 7u72](http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html).
+* Install sbt (simple build tool) 0.13.x for [Mac](http://www.scala-sbt.org/release/tutorial/Installing-sbt-on-Mac.html), [Linux](http://www.scala-sbt.org/release/tutorial/Installing-sbt-on-Linux.html) or [manual installation](http://www.scala-sbt.org/release/tutorial/Manual-Installation.html).
+* Install [Spark 1.1.1](https://spark.apache.org/docs/1.1.1/spark-standalone.html). Don't forget to set SPARK_HOME
+* Install [Mahout 0.10.0](http://mahout.apache.org/general/downloads.html). Don't forget to set MAHOUT_HOME and MAHOUT_LOCAL
+
+Why install these if you are only using them as libraries? Certain binaries and scripts are required by the libraries to get information about the environment, such as discovering where jars are located.
+
+Spark requires a set of jars on the classpath for the client side of an app, and another set of jars must be passed to the Spark context for running distributed code. The example should discover all the necessary classes automatically.
+
+## Application
+Using Mahout as a library in an application requires a little Scala code. Scala has an ```App``` trait, so we'll create an object which inherits from ```App```:
+
+
+    object CooccurrenceDriver extends App {
+    }
+    
+
+This will look a little different than Java, since ```App``` does delayed initialization: the object's body is executed when the app is launched, just as if you had written a Java main method.
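+
+A minimal illustration of that behavior (hypothetical object name):
+
+    object Hello extends App {
+      // this body runs at launch, like the body of a Java main method
+      println("launched")
+    }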
+
+Before we can execute anything on Spark we'll need to create a context. We could use raw Spark calls here, but default values are set up for a Mahout context by using the Mahout helper function:
+
+    implicit val mc = mahoutSparkContext(masterUrl = "local", 
+      appName = "CooccurrenceDriver")
+    
+We need to read in three files containing different interaction types. The files will each be read into a Mahout IndexedDataset. This allows us to preserve application-specific user and item IDs throughout the calculations.
+
+For example, here is data/purchase.csv:
+
+    u1,iphone
+    u1,ipad
+    u2,nexus
+    u2,galaxy
+    u3,surface
+    u4,iphone
+    u4,galaxy
+
+Mahout has a helper function that reads text-delimited files, ```SparkEngine.indexedDatasetDFSReadElements```. The function reads single-element tuples (user-id,item-id) in a distributed way to create the IndexedDataset. Distributed Row Matrices (DRM) and Vectors are important data types supplied by Mahout; an IndexedDataset is like a very lightweight Dataframe in R: it wraps a DRM with HashBiMaps for row and column IDs.
+
+One important thing to note about this example is that we read in all datasets before we adjust the number of rows in them to match the total number of users in the data. This is so the math works out [(A'A, A'B, A'C)](http://mahout.apache.org/users/algorithms/intro-cooccurrence-spark.html): even if some users took one action but not another, all matrices must have the same number of rows.
+
+    /**
+     * Read files of element tuples and create IndexedDatasets one per action. These 
+     * share a userID BiMap but have their own itemID BiMaps
+     */
+    def readActions(actionInput: Array[(String, String)]): Array[(String, IndexedDataset)] = {
+      var actions = Array[(String, IndexedDataset)]()
+
+      val userDictionary: BiMap[String, Int] = HashBiMap.create()
+
+      // The first action named in the sequence is the "primary" action and 
+      // begins to fill up the user dictionary
+      for ( actionDescription <- actionInput ) {// grab the path to actions
+        val action: IndexedDataset = SparkEngine.indexedDatasetDFSReadElements(
+          actionDescription._2,
+          schema = DefaultIndexedDatasetElementReadSchema,
+          existingRowIDs = userDictionary)
+        userDictionary.putAll(action.rowIDs)
+        // put the name in the tuple with the indexedDataset
+        actions = actions :+ (actionDescription._1, action) 
+      }
+
+      // After all actions are read in, the userDictionary will contain every user seen,
+      // even if they did not take every action. Now we adjust the row rank of
+      // all IndexedDatasets to have this number of rows.
+      // Note: this is very important or the cooccurrence calc may fail
+      val numUsers = userDictionary.size() // one more than the cardinality
+
+      val resizedNameActionPairs = actions.map { a =>
+        // resize the matrix, in effect, by adding empty rows
+        val resizedMatrix = a._2.create(a._2.matrix, userDictionary, a._2.columnIDs).newRowCardinality(numUsers)
+        (a._1, resizedMatrix) // return the Tuple of (name, IndexedDataset)
+      }
+      resizedNameActionPairs // return the array of Tuples
+    }
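+
+In the project, the driver builds the action list and calls ```readActions``` roughly like this (a sketch; the file paths are illustrative, though the action names match the output shown at the end of this page):
+
+    val actions = readActions(Array(
+      ("purchase", "data/purchase.csv"),
+      ("view", "data/view.csv"),
+      ("category", "data/category.csv")))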
+
+
+Now that we have the data read in, we can perform the cooccurrence calculation.
+
+    // actions.map creates an array of just the IndexedDatasets
+    val indicatorMatrices = SimilarityAnalysis.cooccurrencesIDSs(
+      actions.map(a => a._2)) 
+
+All we need to do now is write the indicators.
+
+    // zip a pair of arrays into an array of pairs, reattaching the action names
+    val indicatorDescriptions = actions.map(a => a._1).zip(indicatorMatrices)
+    writeIndicators(indicatorDescriptions)
+
+
+The ```writeIndicators``` method uses the default write function ```dfsWrite```.
+
+    /**
+     * Write indicatorMatrices to the output dir in the default format
+     * for indexing by a search engine.
+     */
+    def writeIndicators( indicators: Array[(String, IndexedDataset)]) = {
+      for (indicator <- indicators ) {
+        // create a name based on the type of indicator
+        val indicatorDir = OutputPath + indicator._1
+        indicator._2.dfsWrite(
+          indicatorDir,
+          // Schema tells the writer to omit LLR strengths 
+          // and format for search engine indexing
+          IndexedDatasetWriteBooleanSchema) 
+      }
+    }
+ 
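+Note that ```writeIndicators``` references an ```OutputPath``` constant defined elsewhere in the driver; it points at the project's data directory (the value here is assumed from the output location described below):
+
+    val OutputPath = "data/indicators/"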
+
+See the GitHub project for the full source. Now we create a ```build.sbt``` to build the example.
+
+    name := "cooccurrence-driver"
+
+    organization := "com.finderbots"
+
+    version := "0.1"
+
+    scalaVersion := "2.10.4"
+
+    val sparkVersion = "1.1.1"
+
+    libraryDependencies ++= Seq(
+      "log4j" % "log4j" % "1.2.17",
+      // Mahout's Spark code
+      "commons-io" % "commons-io" % "2.4",
+      "org.apache.mahout" % "mahout-math-scala_2.10" % "0.10.0",
+      "org.apache.mahout" % "mahout-spark_2.10" % "0.10.0",
+      "org.apache.mahout" % "mahout-math" % "0.10.0",
+      "org.apache.mahout" % "mahout-hdfs" % "0.10.0",
+      // Google collections, AKA Guava
+      "com.google.guava" % "guava" % "16.0")
+
+    resolvers += "typesafe repo" at " http://repo.typesafe.com/typesafe/releases/"
+
+    resolvers += Resolver.mavenLocal
+
+    packSettings
+
+    packMain := Map(
+      "cooc" -> "CooccurrenceDriver")
+
+
+## Build
+Build the example from the project's root folder:
+
+    $ sbt pack
+
+This will automatically set up some launcher scripts for the driver. To run it, execute:
+
+    $ target/pack/bin/cooc
+    
+The driver will execute in Spark standalone mode and put the data in /path/to/3-input-cooc/data/indicators/*indicator-type*.
+
+## Using a Debugger
+You can build and run this example in a debugger like IntelliJ IDEA. Install it from the IntelliJ site and add the Scala plugin.
+
+Open IDEA and go to the menu File->New->Project from existing sources->SBT->/path/to/3-input-cooc. This will create an IDEA project from ```build.sbt``` in the root directory.
+
+At this point you may create a "Debug Configuration" to run. In the menu choose Run->Edit Configurations. Under "Default" choose "Application". In the dialog hit the ellipsis button "..." to the right of "Environment Variables" and fill in your versions of JAVA_HOME, SPARK_HOME, and MAHOUT_HOME. In the configuration editor, under "Use classpath from", choose the root-3-input-cooc module.
+
+![image](http://mahout.apache.org/images/debug-config.png)
+
+Now choose "Application" in the left pane and hit the plus sign "+". Give the config a name and hit the ellipsis button to the right of the "Main class" field as shown.
+
+![image](http://mahout.apache.org/images/debug-config-2.png)
+
+
+After setting breakpoints you are now ready to debug the configuration. Go to the Run->Debug... menu and pick your configuration. This will execute using a local standalone instance of Spark.
+
+## The Mahout Shell
+
+For small script-like apps you may wish to use the Mahout shell. It is a Scala REPL-style interactive shell built on the Spark shell with Mahout-Samsara extensions.
+
+To turn CooccurrenceDriver.scala into a script, make the following changes:
+
+* You won't need the context, since it is created when the shell is launched; comment that line out.
+* Replace the logger.info lines with println.
+* Remove the package info, since it's not needed. This produces the file ```path/to/3-input-cooc/bin/CooccurrenceDriver.mscala```.
+
+Note the extension ```.mscala```, indicating that we are using Mahout's Scala extensions for math, otherwise known as [Mahout-Samsara](http://mahout.apache.org/users/environment/out-of-core-reference.html).
+
+To run the code, first make sure the output directory does not already exist:
+
+    $ rm -r /path/to/3-input-cooc/data/indicators
+    
+Launch the Mahout + Spark shell:
+
+    $ mahout spark-shell
+    
+You'll see the Mahout splash:
+
+    MAHOUT_LOCAL is set, so we don't add HADOOP_CONF_DIR to classpath.
+
+                         _                 _
+             _ __ ___   __ _| |__   ___  _   _| |_
+            | '_ ` _ \ / _` | '_ \ / _ \| | | | __|
+            | | | | | | (_| | | | | (_) | |_| | |_
+            |_| |_| |_|\__,_|_| |_|\___/ \__,_|\__|  version 0.10.0
+
+      
+    Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_72)
+    Type in expressions to have them evaluated.
+    Type :help for more information.
+    15/04/26 09:30:48 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
+    Created spark context..
+    Mahout distributed context is available as "implicit val sdc".
+    mahout> 
+
+To load the driver type:
+
+    mahout> :load /path/to/3-input-cooc/bin/CooccurrenceDriver.mscala
+    Loading ./bin/CooccurrenceDriver.mscala...
+    import com.google.common.collect.{HashBiMap, BiMap}
+    import org.apache.log4j.Logger
+    import org.apache.mahout.math.cf.SimilarityAnalysis
+    import org.apache.mahout.math.indexeddataset._
+    import org.apache.mahout.sparkbindings._
+    import scala.collection.immutable.HashMap
+    defined module CooccurrenceDriver
+    mahout> 
+
+To run the driver type:
+
+    mahout> CooccurrenceDriver.main(args = Array(""))
+    
+You'll get some stats printed:
+
+    Total number of users for all actions = 5
+    purchase indicator matrix:
+      Number of rows for matrix = 4
+      Number of columns for matrix = 5
+      Number of rows after resize = 5
+    view indicator matrix:
+      Number of rows for matrix = 4
+      Number of columns for matrix = 5
+      Number of rows after resize = 5
+    category indicator matrix:
+      Number of rows for matrix = 5
+      Number of columns for matrix = 7
+      Number of rows after resize = 5
+    
+If you look in ```path/to/3-input-cooc/data/indicators``` you should find folders containing the indicator matrices.

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/docs/tutorials/play-with-shell.md
----------------------------------------------------------------------
diff --git a/website/docs/tutorials/play-with-shell.md b/website/docs/tutorials/play-with-shell.md
new file mode 100644
index 0000000..d193160
--- /dev/null
+++ b/website/docs/tutorials/play-with-shell.md
@@ -0,0 +1,199 @@
+---
+layout: page
+title: Playing with Mahout's Spark Shell
+theme:
+    name: mahout2
+---
+# Playing with Mahout's Spark Shell 
+
+This tutorial will show you how to play with Mahout's Scala DSL for linear algebra and its Spark shell. **Please keep in mind that this code is still in a very early experimental stage**.
+
+_(Edited for 0.10.2)_
+
+## Intro
+
+We'll use an excerpt of a publicly available [dataset about cereals](http://lib.stat.cmu.edu/DASL/Datafiles/Cereals.html). The dataset lists the protein, fat, carbohydrate and sugar content (in milligrams) of a set of cereals, as well as a customer rating for each cereal. Our aim for this example is to fit a linear model which infers the customer rating from the ingredients.
+
+
+Name                    | protein | fat | carbo | sugars | rating
+:-----------------------|:--------|:----|:------|:-------|:---------
+Apple Cinnamon Cheerios | 2       | 2   | 10.5  | 10     | 29.509541
+Cap'n'Crunch            | 1       | 2   | 12    | 12     | 18.042851
+Cocoa Puffs             | 1       | 1   | 12    | 13     | 22.736446
+Froot Loops             | 2       | 1   | 11    | 13     | 32.207582
+Honey Graham Ohs        | 1       | 2   | 12    | 11     | 21.871292
+Wheaties Honey Gold     | 2       | 1   | 16    | 8      | 36.187559
+Cheerios                | 6       | 2   | 17    | 1      | 50.764999
+Clusters                | 3       | 2   | 13    | 7      | 40.400208
+Great Grains Pecan      | 3       | 3   | 13    | 4      | 45.811716
+
+
+## Installing Mahout & Spark on your local machine
+
+We describe how to do a quick toy setup of Spark & Mahout on your local machine, so that you can run this example and play with the shell. 
+
+ 1. Download [Apache Spark 1.6.2, pre-built for Hadoop 2.6](http://d3kbcqa49mib13.cloudfront.net/spark-1.6.2-bin-hadoop2.6.tgz) and unpack the archive file (this is a pre-built binary distribution, so no Spark build step is needed)
+ 1. Create a directory for Mahout somewhere on your machine, change to it and check out the master branch of Apache Mahout from GitHub: ```git clone https://github.com/apache/mahout mahout```
+ 1. Change to the ```mahout``` directory and build Mahout using ```mvn -DskipTests clean install```
+ 
+## Starting Mahout's Spark shell
+
+ 1. Go to the directory where you unpacked Spark and type ```sbin/start-all.sh``` to start Spark locally
+ 1. Open a browser and point it to [http://localhost:8080/](http://localhost:8080/) to check whether Spark started successfully. Copy the URL of the Spark master at the top of the page (it starts with **spark://**)
+ 1. Define the following environment variables: <pre class="codehilite">export MAHOUT_HOME=[directory into which you checked out Mahout]
+export SPARK_HOME=[directory where you unpacked Spark]
+export MASTER=[url of the Spark master]
+</pre>
+ 1. Finally, change to the directory where you unpacked Mahout and type ```bin/mahout spark-shell```; 
+you should see the shell start and get the prompt ```mahout> ```. Check the 
+[FAQ](http://mahout.apache.org/users/sparkbindings/faq.html) for further troubleshooting.
+
+## Implementation
+
+We'll use the shell to interactively play with the data and incrementally implement a simple [linear regression](https://en.wikipedia.org/wiki/Linear_regression) algorithm. Let's first load the dataset. Usually, we wouldn't need Mahout unless we processed a large dataset stored in a distributed filesystem. But for the sake of this example, we'll use our tiny toy dataset and "pretend" it was too big to fit onto a single machine.
+
+*Note: You can incrementally follow the example by copy-and-pasting the code into your running Mahout shell.*
+
+Mahout's linear algebra DSL has an abstraction called *DistributedRowMatrix (DRM)* which models a matrix that is partitioned by rows and stored in the memory of a cluster of machines. We use ```dense()``` to create a dense in-memory matrix from our toy dataset and use ```drmParallelize``` to load it into the cluster, "mimicking" a large, partitioned dataset.
+
+<div class="codehilite"><pre>
+val drmData = drmParallelize(dense(
+  (2, 2, 10.5, 10, 29.509541),  // Apple Cinnamon Cheerios
+  (1, 2, 12,   12, 18.042851),  // Cap'n'Crunch
+  (1, 1, 12,   13, 22.736446),  // Cocoa Puffs
+  (2, 1, 11,   13, 32.207582),  // Froot Loops
+  (1, 2, 12,   11, 21.871292),  // Honey Graham Ohs
+  (2, 1, 16,   8,  36.187559),  // Wheaties Honey Gold
+  (6, 2, 17,   1,  50.764999),  // Cheerios
+  (3, 2, 13,   7,  40.400208),  // Clusters
+  (3, 3, 13,   4,  45.811716)), // Great Grains Pecan
+  numPartitions = 2);
+</pre></div>
+
+Have a look at this matrix. The first four columns represent the ingredients 
+(our features) and the last column (the rating) is the target variable for 
+our regression. [Linear regression](https://en.wikipedia.org/wiki/Linear_regression) 
+assumes that the **target variable** `\(\mathbf{y}\)` is generated by the 
+linear combination of **the feature matrix** `\(\mathbf{X}\)` with the 
+**parameter vector** `\(\boldsymbol{\beta}\)` plus the
+ **noise** `\(\boldsymbol{\varepsilon}\)`, summarized in the formula 
+`\(\mathbf{y}=\mathbf{X}\boldsymbol{\beta}+\boldsymbol{\varepsilon}\)`. 
+Our goal is to find an estimate of the parameter vector 
+`\(\boldsymbol{\beta}\)` that explains the data very well.
+
+As a first step, we extract `\(\mathbf{X}\)` and `\(\mathbf{y}\)` from our data matrix. We get *X* by slicing: we take all rows (denoted by ```::```) and the first four columns, which contain the ingredients in milligrams. Note that the result is again a DRM. The shell will not execute this code yet; it saves the history of operations and defers execution until we really access a result. **Mahout's DSL automatically optimizes and parallelizes all operations on DRMs and runs them on Apache Spark.**
+
+<div class="codehilite"><pre>
+val drmX = drmData(::, 0 until 4)
+</pre></div>
+
+Next, we extract the target variable vector *y*, the fifth column of the data matrix. We assume this one fits into our driver machine, so we fetch it into memory using ```collect```:
+
+<div class="codehilite"><pre>
+val y = drmData.collect(::, 4)
+</pre></div>
+
+Now we are ready to think about a mathematical way to estimate the parameter vector *β*. A simple textbook approach is [ordinary least squares (OLS)](https://en.wikipedia.org/wiki/Ordinary_least_squares), which minimizes the sum of residual squares between the true target variable and the prediction of the target variable. In OLS, there is even a closed form expression for estimating `\(\boldsymbol{\beta}\)` as 
+`\(\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{X}^{\top}\mathbf{y}\)`.
+
+The first thing we compute for this is `\(\mathbf{X}^{\top}\mathbf{X}\)`. The code for doing this in Mahout's Scala DSL maps directly to the mathematical formula: the operation ```.t()``` transposes a matrix and, as in R, ```%*%``` denotes matrix multiplication.
+
+<div class="codehilite"><pre>
+val drmXtX = drmX.t %*% drmX
+</pre></div>
+
+The same is true for computing `\(\mathbf{X}^{\top}\mathbf{y}\)`. We can simply type the math as Scala expressions into the shell. Here, *X* lives in the cluster, while *y* is in the memory of the driver; the result is a DRM again.
+<div class="codehilite"><pre>
+val drmXty = drmX.t %*% y
+</pre></div>
+
+We're nearly done. The next step we take is to fetch `\(\mathbf{X}^{\top}\mathbf{X}\)` and 
+`\(\mathbf{X}^{\top}\mathbf{y}\)` into the memory of our driver machine (we are targeting 
+feature matrices that are tall and skinny, 
+so we can assume that `\(\mathbf{X}^{\top}\mathbf{X}\)` is small enough 
+to fit in). Then, we provide them to an in-memory solver (Mahout provides 
+an analog to R's ```solve()``` for that) which computes ```beta```, our 
+OLS estimate of the parameter vector `\(\boldsymbol{\beta}\)`.
+
+<div class="codehilite"><pre>
+val XtX = drmXtX.collect
+val Xty = drmXty.collect(::, 0)
+
+val beta = solve(XtX, Xty)
+</pre></div>
+
+That's it! We have implemented a distributed linear regression algorithm 
+on Apache Spark. I hope you agree that we didn't have to worry a lot about 
+parallelization and distributed systems. The goal of Mahout's linear algebra 
+DSL is to abstract away the ugliness of programming a distributed system 
+as much as possible, while still retaining decent performance and 
+scalability.
+
+We can now check how well our model fits its training data. 
+First, we multiply the feature matrix `\(\mathbf{X}\)` by our estimate of 
+`\(\boldsymbol{\beta}\)`. Then, we look at the difference (via the L2-norm) between 
+the target variable `\(\mathbf{y}\)` and the fitted target variable:
+
+<div class="codehilite"><pre>
+val yFitted = (drmX %*% beta).collect(::, 0)
+(y - yFitted).norm(2)
+</pre></div>
+
+We hope we have shown that Mahout's shell allows you to write algorithms interactively and incrementally. We entered a lot of individual commands, one by one, until we got the desired results. We can now refactor a little by wrapping our statements into easy-to-use functions. The definition of functions follows standard Scala syntax.
+
+We put all the commands for ordinary least squares into a function ```ols```. 
+
+<div class="codehilite"><pre>
+def ols(drmX: DrmLike[Int], y: Vector) = 
+  solve(drmX.t %*% drmX, drmX.t %*% y)(::, 0)
+
+</pre></div>
+
+Note that the DSL declares an implicit `collect` if coercion rules require an in-core argument. Hence, we can simply
+skip explicit `collect`s. 
+
+Next, we define a function ```goodnessOfFit``` that tells how well a model fits the target variable:
+
+<div class="codehilite"><pre>
+def goodnessOfFit(drmX: DrmLike[Int], beta: Vector, y: Vector) = {
+  val fittedY = (drmX %*% beta).collect(::, 0)
+  (y - fittedY).norm(2)
+}
+</pre></div>
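+
+To validate the refactoring, we can recompute the fit without a bias term; the L2-norm should match the one we computed step by step above:
+
+<div class="codehilite"><pre>
+val betaNoBias = ols(drmX, y)
+goodnessOfFit(drmX, betaNoBias, y)
+</pre></div>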
+
+So far we have left out an important aspect of a standard linear regression 
+model. Usually there is a constant bias term added to the model. Without 
+that, our model always crosses through the origin and we only learn the 
+slope. An easy way to add such a bias term to our model is to add a 
+column of ones to the feature matrix `\(\mathbf{X}\)`. 
+The corresponding weight in the parameter vector will then be the bias term.
+
+Here is how we add a bias column:
+
+<div class="codehilite"><pre>
+val drmXwithBiasColumn = drmX cbind 1
+</pre></div>
+
+Now we can give the newly created DRM ```drmXwithBiasColumn``` to our model fitting method ```ols``` and see how well the resulting model fits the training data with ```goodnessOfFit```. You should see a large improvement in the result.
+
+<div class="codehilite"><pre>
+val betaWithBiasTerm = ols(drmXwithBiasColumn, y)
+goodnessOfFit(drmXwithBiasColumn, betaWithBiasTerm, y)
+</pre></div>
+
+As a further optimization, we can make use of the DSL's caching functionality. We use ```drmXwithBiasColumn``` repeatedly as input to computations, so it might be beneficial to cache it in memory. This is achieved by calling ```checkpoint()```. In the end, we remove it from the cache with ```uncache```:
+
+<div class="codehilite"><pre>
+val cachedDrmX = drmXwithBiasColumn.checkpoint()
+
+val betaWithBiasTerm = ols(cachedDrmX, y)
+val goodness = goodnessOfFit(cachedDrmX, betaWithBiasTerm, y)
+
+cachedDrmX.uncache()
+
+goodness
+</pre></div>
+
+
+Liked what you saw? Check out Mahout's overview of the [Scala and Spark bindings](https://mahout.apache.org/users/sparkbindings/home.html).
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/downloads.md
----------------------------------------------------------------------
diff --git a/website/downloads.md b/website/downloads.md
deleted file mode 100644
index 8f33466..0000000
--- a/website/downloads.md
+++ /dev/null
@@ -1,67 +0,0 @@
----
-layout: default
-title: Downloads
-theme: mahout
----
-
-<a name="Downloads-OfficialRelease"></a>
-# Official Release
-Apache Mahout is an official Apache project and thus available from any of
-the Apache mirrors. The latest Mahout release is available for download at: 
-
-* [Download Latest](http://www.apache.org/dyn/closer.cgi/mahout/)
-* [Release Archive](http://archive.apache.org/dist/mahout/)
-
-
-# Source code for the current snapshot
-
-Apache Mahout is mirrored to [Github](https://github.com/apache/mahout). To get all source:
-
-    git clone https://github.com/apache/mahout.git mahout
-   
-# Environment
-
-Whether you are using Mahout's Shell, running command line jobs or using it as a library to build your own apps 
-you'll need to setup several environment variables. 
-Edit your environment in ```~/.bash_profile``` for Mac or ```~/.bashrc``` for many linux distributions. Add the following
-
-    export MAHOUT_HOME=/path/to/mahout
-    export MAHOUT_LOCAL=true # for running standalone on your dev machine, 
-    # unset MAHOUT_LOCAL for running on a cluster 
-
-If you are running on Spark you will also need $SPARK_HOME
-
-Make sure to have $JAVA_HOME set also
-
-# Using Mahout as a Library
-
-Running any application that uses Mahout will require installing a binary or source version and setting the environment.  
-Then add the appropriate setting to your pom.xml or build.sbt following the template below.
- 
-If you only need the math part of Mahout:
-
-    <dependency>
-        <groupId>org.apache.mahout</groupId>
-        <artifactId>mahout-math</artifactId>
-        <version>${mahout.version}</version>
-    </dependency>
-
-In case you would like to use some of our integration tooling (e.g. for generating vectors from Lucene):
-
-    <dependency>
-        <groupId>org.apache.mahout</groupId>
-        <artifactId>mahout-hdfs</artifactId>
-        <version>${mahout.version}</version>
-    </dependency>
-
-In case you are using Ivy, Gradle, Buildr, Grape or SBT you might want to directly head over to the official [Maven Repository search](http://mvnrepository.com/artifact/org.apache.mahout/mahout-core).
-
-
-<a name="Downloads-FutureReleases"></a>
-# Future Releases
-
-Official releases are usually created when the developers feel there are
-sufficient changes, improvements and bug fixes to warrant a release. Watch
-the <a href="https://mahout.apache.org/general/mailing-lists,-irc-and-archives.html">Mailing lists</a>
- for latest release discussions and check the Github repo.
-

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/feed.xml
----------------------------------------------------------------------
diff --git a/website/feed.xml b/website/feed.xml
deleted file mode 100644
index 449315f..0000000
--- a/website/feed.xml
+++ /dev/null
@@ -1,29 +0,0 @@
----
----
-<?xml version="1.0" encoding="UTF-8"?>
-<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
-  <channel>
-    <title>{{ site.data.global.title | xml_escape }}</title>
-    <description>{{ site.data.global.description | xml_escape }}</description>
-    <link>{{ site.data.global.url }}{{ site.baseurl }}/</link>
-    <atom:link href="{{ "/feed.xml" | prepend: site.baseurl | prepend: site.data.global.url }}" rel="self" type="application/rss+xml"/>
-    <pubDate>{{ site.time | date_to_rfc822 }}</pubDate>
-    <lastBuildDate>{{ site.time | date_to_rfc822 }}</lastBuildDate>
-    <generator>Jekyll v{{ jekyll.version }}</generator>
-    {% for post in site.posts limit:10 %}
-      <item>
-        <title>{{ post.title | xml_escape }}</title>
-        <description>{{ post.content | xml_escape }}</description>
-        <pubDate>{{ post.date | date_to_rfc822 }}</pubDate>
-        <link>{{ post.url | prepend: site.baseurl | prepend: site.data.global.url }}</link>
-        <guid isPermaLink="true">{{ post.url | prepend: site.baseurl | prepend: site.data.global.url }}</guid>
-        {% for tag in post.tags %}
-        <category>{{ tag | xml_escape }}</category>
-        {% endfor %}
-        {% for cat in post.categories %}
-        <category>{{ cat | xml_escape }}</category>
-        {% endfor %}
-      </item>
-    {% endfor %}
-  </channel>
-</rss>

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/404.html
----------------------------------------------------------------------
diff --git a/website/front/404.html b/website/front/404.html
new file mode 100755
index 0000000..6904bcd
--- /dev/null
+++ b/website/front/404.html
@@ -0,0 +1 @@
+Sorry this page does not exist =(

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/Gemfile
----------------------------------------------------------------------
diff --git a/website/front/Gemfile b/website/front/Gemfile
new file mode 100755
index 0000000..301d29c
--- /dev/null
+++ b/website/front/Gemfile
@@ -0,0 +1,5 @@
+source "https://rubygems.org"
+
+gem "jekyll", "~> 3.1"
+gem "jekyll-sitemap"
+gem "pygments.rb"

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/History.markdown
----------------------------------------------------------------------
diff --git a/website/front/History.markdown b/website/front/History.markdown
new file mode 100755
index 0000000..5ef89c1
--- /dev/null
+++ b/website/front/History.markdown
@@ -0,0 +1,16 @@
+## HEAD
+
+### Major Enhancements
+
+### Minor Enhancements
+  * Add `drafts` folder support (#167)
+  * Add `excerpt` support (#168)
+  * Create History.markdown to help project management (#169)
+
+### Bug Fixes
+
+### Site Enhancements
+
+### Compatibility updates
+  * Update `preview` task
+

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/LICENSE
----------------------------------------------------------------------
diff --git a/website/front/LICENSE b/website/front/LICENSE
new file mode 100755
index 0000000..01a0839
--- /dev/null
+++ b/website/front/LICENSE
@@ -0,0 +1,21 @@
+The MIT License (MIT)
+
+Copyright (c) 2015 Jade Dominguez
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/README.md
----------------------------------------------------------------------
diff --git a/website/front/README.md b/website/front/README.md
new file mode 100755
index 0000000..62fcfca
--- /dev/null
+++ b/website/front/README.md
@@ -0,0 +1,78 @@
+# Jekyll-Bootstrap
+
+The quickest way to start and publish your Jekyll powered blog. 100% compatible with GitHub pages
+
+## Usage
+
+For all usage and documentation please see: <http://jekyllbootstrap.com>
+
+## Version
+
+0.3.0 - stable and versioned using [semantic versioning](http://semver.org/).
+
+**NOTE:** 0.3.0 introduces a new theme which is not backwards compatible in the sense it won't _look_ like the old version.
+However, the actual API has not changed at all.
+You might want to run 0.3.0 in a branch to make sure you are ok with the theme design changes.
+
+## Milestones
+
+[0.4.0](https://github.com/plusjade/jekyll-bootstrap/milestones/v%200.4.0) - next release [ETA 03/29/2015]
+
+### GOALS
+
+* No open PRs against master branch.
+* Squash some bugs.
+* Add some new features (low-hanging fruit).
+* Establish social media presence.
+
+
+### Bugs
+
+|Bug |Description
+|------|---------------
+|[#86](https://github.com/plusjade/jekyll-bootstrap/issues/86)  |&#x2611; Facebook Comments
+|[#113](https://github.com/plusjade/jekyll-bootstrap/issues/113)|&#x2611; ASSET_PATH w/ page & post
+|[#144](https://github.com/plusjade/jekyll-bootstrap/issues/144)|&#x2610; BASE_PATH w/ FQDN
+|[#227](https://github.com/plusjade/jekyll-bootstrap/issues/227)|&#x2611; Redundant JB/setup
+
+### Features
+
+|Bug |Description
+|------|---------------
+|[#98](https://github.com/plusjade/jekyll-bootstrap/issues/98)  |&#x2611; GIST Integration
+|[#244](https://github.com/plusjade/jekyll-bootstrap/issues/244)|&#x2611; JB/file_exists Helper
+|[#42](https://github.com/plusjade/jekyll-bootstrap/issues/42)  |&#x2611; Sort collections of Pages / Posts
+|[#84](https://github.com/plusjade/jekyll-bootstrap/issues/84)  |&#x2610; Detecting production mode
+
+### TODOS
+
+Review existing pull requests against plusjade/jekyll-bootstrap:master. Merge or close each.
+
+* Create twitter account. Add link / icon on jekyllbootstrap.com.
+* Create blog posts under plusjade/gh-pages, expose on jekyllbootstrap.com, feed to twitter account.
+* Announce state of project, announce roadmap(s), announce new versions as they’re released.
+
+## Contributing
+
+
+To contribute to the framework please make sure to checkout your branch based on `jb-development`!!
+This is very important as it allows me to accept your pull request without having to publish a public version release.
+
+Small, atomic Features, bugs, etc.
+Use the `jb-development` branch but note it will likely change fast as pull requests are accepted.
+Please rebase as often as possible when working.
+Work on small, atomic features/bugs to avoid upstream commits affecting/breaking your development work.
+
+For Big Features or major API extensions/edits:
+This is the one case where I'll accept pull-requests based off the master branch.
+This allows you to work in isolation but it means I'll have to manually merge your work into the next public release.
+Translation: it might take a bit longer so please be patient! (but sincerely, thank you).
+
+**Jekyll-Bootstrap Documentation Website.**
+
+The documentation website at <http://jekyllbootstrap.com> is maintained at https://github.com/plusjade/jekyllbootstrap.com
+
+
+## License
+
+[MIT](http://opensource.org/licenses/MIT)

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/Rakefile
----------------------------------------------------------------------
diff --git a/website/front/Rakefile b/website/front/Rakefile
new file mode 100755
index 0000000..183ca1e
--- /dev/null
+++ b/website/front/Rakefile
@@ -0,0 +1,306 @@
+require "rubygems"
+require 'rake'
+require 'yaml'
+require 'time'
+
+SOURCE = "."
+CONFIG = {
+  'version' => "0.3.0",
+  'themes' => File.join(SOURCE, "_includes", "themes"),
+  'layouts' => File.join(SOURCE, "_layouts"),
+  'posts' => File.join(SOURCE, "_posts"),
+  'post_ext' => "md",
+  'theme_package_version' => "0.1.0"
+}
+
+# Path configuration helper
+module JB
+  class Path
+    SOURCE = "."
+    Paths = {
+      :layouts => "_layouts",
+      :themes => "_includes/themes",
+      :theme_assets => "assets/themes",
+      :theme_packages => "_theme_packages",
+      :posts => "_posts"
+    }
+    
+    def self.base
+      SOURCE
+    end
+
+    # build a path relative to configured path settings.
+    def self.build(path, opts = {})
+      opts[:root] ||= SOURCE
+      path = "#{opts[:root]}/#{Paths[path.to_sym]}/#{opts[:node]}".split("/")
+      path.compact!
+      File.__send__ :join, path
+    end
+  
+  end #Path
+end #JB
+
+# Usage: rake post title="A Title" [date="2012-02-09"] [tags=[tag1,tag2]] [category="category"]
+desc "Begin a new post in #{CONFIG['posts']}"
+task :post do
+  abort("rake aborted: '#{CONFIG['posts']}' directory not found.") unless FileTest.directory?(CONFIG['posts'])
+  title = ENV["title"] || "new-post"
+  tags = ENV["tags"] || "[]"
+  category = ENV["category"] || ""
+  category = "\"#{category.gsub(/-/,' ')}\"" if !category.empty?
+  slug = title.downcase.strip.gsub(' ', '-').gsub(/[^\w-]/, '')
+  begin
+    date = (ENV['date'] ? Time.parse(ENV['date']) : Time.now).strftime('%Y-%m-%d')
+  rescue => e
+    puts "Error - date format must be YYYY-MM-DD, please check you typed it correctly!"
+    exit -1
+  end
+  filename = File.join(CONFIG['posts'], "#{date}-#{slug}.#{CONFIG['post_ext']}")
+  if File.exist?(filename)
+    abort("rake aborted!") if ask("#{filename} already exists. Do you want to overwrite?", ['y', 'n']) == 'n'
+  end
+  
+  puts "Creating new post: #{filename}"
+  open(filename, 'w') do |post|
+    post.puts "---"
+    post.puts "layout: post"
+    post.puts "title: \"#{title.gsub(/-/,' ')}\""
+    post.puts 'description: ""'
+    post.puts "category: #{category}"
+    post.puts "tags: #{tags}"
+    post.puts "---"
+    post.puts "{% include JB/setup %}"
+  end
+end # task :post
+
+# Usage: rake page name="about.html"
+# You can also specify a sub-directory path.
+# If you don't specify a file extension we create an index.html at the path specified
+desc "Create a new page."
+task :page do
+  name = ENV["name"] || "new-page.md"
+  filename = File.join(SOURCE, "#{name}")
+  filename = File.join(filename, "index.html") if File.extname(filename) == ""
+  title = File.basename(filename, File.extname(filename)).gsub(/[\W\_]/, " ").gsub(/\b\w/){$&.upcase}
+  if File.exist?(filename)
+    abort("rake aborted!") if ask("#{filename} already exists. Do you want to overwrite?", ['y', 'n']) == 'n'
+  end
+  
+  mkdir_p File.dirname(filename)
+  puts "Creating new page: #{filename}"
+  open(filename, 'w') do |post|
+    post.puts "---"
+    post.puts "layout: page"
+    post.puts "title: \"#{title}\""
+    post.puts 'description: ""'
+    post.puts "---"
+    post.puts "{% include JB/setup %}"
+  end
+end # task :page
+
+desc "Launch preview environment"
+task :preview do
+  system "jekyll serve -w"
+end # task :preview
+
+# Public: Alias - Maintains backwards compatibility for theme switching.
+task :switch_theme => "theme:switch"
+
+namespace :theme do
+  
+  # Public: Switch from one theme to another for your blog.
+  #
+  # name - String, Required. name of the theme you want to switch to.
+  #        The theme must be installed into your JB framework.
+  #
+  # Examples
+  #
+  #   rake theme:switch name="the-program"
+  #
+  # Returns Success/failure messages.
+  desc "Switch between Jekyll-bootstrap themes."
+  task :switch do
+    theme_name = ENV["name"].to_s
+    theme_path = File.join(CONFIG['themes'], theme_name)
+    settings_file = File.join(theme_path, "settings.yml")
+    non_layout_files = ["settings.yml"]
+
+    abort("rake aborted: name cannot be blank") if theme_name.empty?
+    abort("rake aborted: '#{theme_path}' directory not found.") unless FileTest.directory?(theme_path)
+    abort("rake aborted: '#{CONFIG['layouts']}' directory not found.") unless FileTest.directory?(CONFIG['layouts'])
+
+    Dir.glob("#{theme_path}/*") do |filename|
+      next if non_layout_files.include?(File.basename(filename).downcase)
+      puts "Generating '#{theme_name}' layout: #{File.basename(filename)}"
+
+      open(File.join(CONFIG['layouts'], File.basename(filename)), 'w') do |page|
+        page.puts "---"
+        page.puts File.read(settings_file) if File.exist?(settings_file)
+        page.puts "layout: default" unless File.basename(filename, ".html").downcase == "default"
+        page.puts "---"
+        page.puts "{% include JB/setup %}"
+        page.puts "{% include themes/#{theme_name}/#{File.basename(filename)} %}" 
+      end
+    end
+    
+    puts "=> Theme successfully switched!"
+    puts "=> Reload your web-page to check it out =)"
+  end # task :switch
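+
+  # Illustration (not part of the original task): for a theme named "mahout2"
+  # shipping a file "post.html", the generated _layouts/post.html would read:
+  #
+  #   ---
+  #   ...contents of the theme's settings.yml, if present...
+  #   layout: default
+  #   ---
+  #   {% include JB/setup %}
+  #   {% include themes/mahout2/post.html %}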
+  
+  # Public: Install a theme using the theme packager.
+  # Version 0.1.0 uses simple 1:1 file matching.
+  #
+  # git  - String, Optional path to the git repository of the theme to be installed.
+  # name - String, Optional name of the theme you want to install.
+  #        Passing name requires that the theme package already exist.
+  #
+  # Examples
+  #
+  #   rake theme:install git="https://github.com/jekyllbootstrap/theme-twitter.git"
+  #   rake theme:install name="cool-theme"
+  #
+  # Returns Success/failure messages.
+  desc "Install theme"
+  task :install do
+    if ENV["git"]
+      manifest = theme_from_git_url(ENV["git"])
+      name = manifest["name"]
+    else
+      name = ENV["name"].to_s.downcase
+    end
+
+    packaged_theme_path = JB::Path.build(:theme_packages, :node => name)
+    
+    abort("rake aborted!
+      => ERROR: 'name' cannot be blank") if name.empty?
+    abort("rake aborted! 
+      => ERROR: '#{packaged_theme_path}' directory not found.
+      => Installable themes can be added via git. You can find some here: http://github.com/jekyllbootstrap
+      => To download+install run: `rake theme:install git='[PUBLIC-CLONE-URL]'`
+      => example : rake theme:install git='git@github.com:jekyllbootstrap/theme-the-program.git'
+    ") unless FileTest.directory?(packaged_theme_path)
+    
+    manifest = verify_manifest(packaged_theme_path)
+    
+    # Get relative paths to packaged theme files
+    # Exclude directories as they'll be recursively created. Exclude meta-data files.
+    packaged_theme_files = []
+    FileUtils.cd(packaged_theme_path) {
+      Dir.glob("**/*.*") { |f| 
+        next if ( FileTest.directory?(f) || f =~ /^(manifest|readme|packager)/i )
+        packaged_theme_files << f 
+      }
+    }
+    
+    # Mirror each file into the framework making sure to prompt if already exists.
+    packaged_theme_files.each do |filename|
+      file_install_path = File.join(JB::Path.base, filename)
+      next if File.exist?(file_install_path) &&
+              ask("#{file_install_path} already exists. Do you want to overwrite?", ['y', 'n']) == 'n'
+      mkdir_p File.dirname(file_install_path)
+      cp_r File.join(packaged_theme_path, filename), file_install_path
+    end
+    
+    puts "=> #{name} theme has been installed!"
+    puts "=> ---"
+    if ask("=> Want to switch themes now?", ['y', 'n']) == 'y'
+      system("rake switch_theme name='#{name}'")
+    end
+  end
+
+  # Public: Package a theme using the theme packager.
+  # The theme must be structured according to the JB API.
+  # In other words, packaging is essentially the reverse of installing.
+  #
+  # name - String, Required name of the theme you want to package.
+  #        
+  # Examples
+  #
+  #   rake theme:package name="twitter"
+  #
+  # Returns Success/failure messages.
+  desc "Package theme"
+  task :package do
+    name = ENV["name"].to_s.downcase
+    theme_path = JB::Path.build(:themes, :node => name)
+    asset_path = JB::Path.build(:theme_assets, :node => name)
+
+    abort("rake aborted: name cannot be blank") if name.empty?
+    abort("rake aborted: '#{theme_path}' directory not found.") unless FileTest.directory?(theme_path)
+    abort("rake aborted: '#{asset_path}' directory not found.") unless FileTest.directory?(asset_path)
+    
+    ## Mirror theme's template directory (_includes)
+    packaged_theme_path = JB::Path.build(:themes, :root => JB::Path.build(:theme_packages, :node => name))
+    mkdir_p packaged_theme_path
+    cp_r theme_path, packaged_theme_path
+    
+    ## Mirror theme's asset directory
+    packaged_theme_assets_path = JB::Path.build(:theme_assets, :root => JB::Path.build(:theme_packages, :node => name))
+    mkdir_p packaged_theme_assets_path
+    cp_r asset_path, packaged_theme_assets_path
+
+    ## Log packager version
+    packager = {"packager" => {"version" => CONFIG["theme_package_version"].to_s } }
+    open(JB::Path.build(:theme_packages, :node => "#{name}/packager.yml"), "w") do |page|
+      page.puts packager.to_yaml
+    end
+    
+    puts "=> '#{name}' theme is packaged and available at: #{JB::Path.build(:theme_packages, :node => name)}"
+  end
+  
+end # end namespace :theme
+
+# Internal: Download and process a theme from a git url.
+# Note that we don't know the theme's name until we look it up in the manifest,
+# so we clone into a temporary folder and rename it once we have the name.
+#
+# url - String, Required url to git repository.
+#        
+# Returns theme manifest hash
+def theme_from_git_url(url)
+  tmp_path = JB::Path.build(:theme_packages, :node => "_tmp")
+  abort("rake aborted: system call to git clone failed") if !system("git clone #{url} #{tmp_path}")
+  manifest = verify_manifest(tmp_path)
+  new_path = JB::Path.build(:theme_packages, :node => manifest["name"])
+  if File.exist?(new_path) && ask("=> #{new_path} theme package already exists. Override?", ['y', 'n']) == 'n'
+    remove_dir(tmp_path)
+    abort("rake aborted: '#{manifest["name"]}' already exists as theme package.")
+  end
+
+  remove_dir(new_path) if File.exist?(new_path)
+  mv(tmp_path, new_path)
+  manifest
+end
+
+# Internal: Process theme package manifest file.
+#
+# theme_path - String, Required. File path to theme package.
+#        
+# Returns theme manifest hash
+def verify_manifest(theme_path)
+  manifest_path = File.join(theme_path, "manifest.yml")
+  # Check for the manifest before opening it; File.open would raise ENOENT on
+  # a missing file, bypassing the friendlier abort message below.
+  abort("rake aborted: repo must contain valid manifest.yml") unless File.exist?(manifest_path)
+  YAML.load_file(manifest_path)
+end
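+# A minimal manifest.yml, inferred from the lookups above (only "name" is
+# actually read by these tasks):
+#
+#   name : my-theme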
+
+def ask(message, valid_options)
+  if valid_options
+    answer = get_stdin("#{message} #{valid_options.to_s.gsub(/"/, '').gsub(/, /,'/')} ") until valid_options.include?(answer)
+  else
+    answer = get_stdin(message)
+  end
+  answer
+end
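+# e.g. ask("Overwrite?", ['y', 'n']) prompts with "Overwrite? [y/n] " and
+# re-prompts until the user enters one of the valid options.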
+
+def get_stdin(message)
+  print message
+  STDIN.gets.chomp
+end
+
+#Load custom rake scripts
+Dir['_rake/*.rake'].each { |r| load r }

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/_config.yml
----------------------------------------------------------------------
diff --git a/website/front/_config.yml b/website/front/_config.yml
new file mode 100644
index 0000000..93846b5
--- /dev/null
+++ b/website/front/_config.yml
@@ -0,0 +1,131 @@
+# This is the default format.
+# For more see: http://jekyllrb.com/docs/permalinks/
+permalink: /:categories/:year/:month/:day/:title
+
+exclude: [".rvmrc", ".rbenv-version", "README.md", "Rakefile", "changelog.md", "vendor", "node_modules", "scss"]
+#pygments: true
+highlighter: rouge
+markdown: kramdown
+redcarpet:
+  extensions: ["tables"]
+encoding: utf-8
+
+sass:
+    sass_dir: _sass
+
+# Themes are encouraged to use these universal variables
+# so be sure to set them if your theme uses them.
+#
+title : Apache Mahout
+tagline: Distributed Linear Algebra
+author :
+  name : The Apache Software Foundation
+  email : dev@mahout.apache.org
+  github : apache
+  twitter : ASF
+  feedburner : feedname
+
+# Serving
+detach:  false
+port:    4000
+host:    127.0.0.1
+baseurl: "" # does not include hostname
+
+MAHOUT_VERSION : 0.13.1
+
+# The production_url is only used when full-domain names are needed
+# such as sitemap.txt
+# Most places will/should use BASE_PATH to make the urls
+#
+# If you have set a CNAME (pages.github.com) set your custom domain here.
+# Else if you are pushing to username.github.io, replace with your username.
+# Finally if you are pushing to a GitHub project page, include the project name at the end.
+#
+production_url : http://mahout.apache.org/
+# All Jekyll-Bootstrap specific configurations are namespaced into this hash
+#
+JB :
+  version : 0.3.0
+
+  # All links will be namespaced by BASE_PATH if defined.
+  # Links in your website should always be prefixed with {{BASE_PATH}}
+  # however this value will be dynamically changed depending on your deployment situation.
+  #
+  # CNAME (http://yourcustomdomain.com)
+  #   DO NOT SET BASE_PATH
+  #   (urls will be prefixed with "/" and work relatively)
+  #
+  # GitHub Pages (http://username.github.io)
+  #   DO NOT SET BASE_PATH
+  #   (urls will be prefixed with "/" and work relatively)
+  #
+  # GitHub Project Pages (http://username.github.io/project-name)
+  #
+  #   A GitHub Project site exists in the `gh-pages` branch of one of your repositories.
+  #  REQUIRED! Set BASE_PATH to: http://username.github.io/project-name
+  #
+  # CAUTION:
+  #   - When in Localhost, your site will run from root "/" regardless of BASE_PATH
+  #   - Only the following values are falsy: ["", null, false]
+  #   - When setting BASE_PATH it must be a valid url.
+  #     This means always setting the protocol (http|https) or prefixing with "/"
+  BASE_PATH : "/"
+
+  # By default, the asset_path is automatically defined relative to BASE_PATH plus the enabled theme.
+  # ex: [BASE_PATH]/assets/themes/[THEME-NAME]
+  #
+  # Override this by defining an absolute path to assets here.
+  # ex:
+  #   http://s3.amazonaws.com/yoursite/themes/watermelon
+  #   /assets
+  #
+  ASSET_PATH : false
+
+  # These paths are to the main pages Jekyll-Bootstrap ships with.
+  # Some JB helpers refer to these paths; change them here if needed.
+  #
+  archive_path: /archive.html
+  categories_path : /categories.html
+  tags_path : /tags.html
+  atom_path : /atom.xml
+  rss_path : /rss.xml
+
+  # Settings for comments helper
+  # Set 'provider' to the comment provider you want to use.
+  # Set 'provider' to false to turn commenting off globally.
+  #
+  comments :
+    provider : disqus
+    disqus :
+      short_name : jekyllbootstrap
+    livefyre :
+      site_id : 123
+    intensedebate :
+      account : 123abc
+    facebook :
+      appid : 123
+      num_posts: 5
+      width: 580
+      colorscheme: light
+
+  # Settings for analytics helper
+  # Set 'provider' to the analytics provider you want to use.
+  # Set 'provider' to false to turn analytics off globally.
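+  #
+  # No analytics hash is defined in this file, so analytics stays disabled.
+  # To enable it you could add, for example (hypothetical tracking id):
+  #
+  #   analytics :
+  #     provider : google
+  #     google :
+  #         tracking_id : 'UA-XXXXXXXX-X'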
+
+  # Settings for sharing helper.
+  # Sharing is for things like tweet, plusone, like, reddit buttons etc.
+  # Set 'provider' to the sharing provider you want to use.
+  # Set 'provider' to false to turn sharing off globally.
+  #
+  sharing :
+    provider : false
+
+  # Settings for all other include helpers can be defined by creating
+  # a hash with key named for the given helper. ex:
+  #
+  #   pages_list :
+  #     provider : "custom"
+  #
+  # Setting any helper's provider to 'custom' will bypass the helper code
+  # and include your custom code. Your custom file must be defined at:
+  #   ./_includes/custom/[HELPER]
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/_includes/JB/analytics
----------------------------------------------------------------------
diff --git a/website/front/_includes/JB/analytics b/website/front/_includes/JB/analytics
new file mode 100755
index 0000000..2bb4c80
--- /dev/null
+++ b/website/front/_includes/JB/analytics
@@ -0,0 +1,20 @@
+{% include JB/is_production %}
+
+{% if is_production and site.JB.analytics.provider and page.JB.analytics != false %}
+
+{% case site.JB.analytics.provider %}
+{% when "gauges" %}
+  {% include JB/analytics-providers/gauges %}
+{% when "google" %}
+  {% include JB/analytics-providers/google %}
+{% when "getclicky" %}
+  {% include JB/analytics-providers/getclicky %}
+{% when "mixpanel" %}
+  {% include JB/analytics-providers/mixpanel %}
+{% when "piwik" %}
+  {% include JB/analytics-providers/piwik %}
+{% when "custom" %}
+  {% include custom/analytics %}
+{% endcase %}
+
+{% endif %}

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/_includes/JB/analytics-providers/gauges
----------------------------------------------------------------------
diff --git a/website/front/_includes/JB/analytics-providers/gauges b/website/front/_includes/JB/analytics-providers/gauges
new file mode 100755
index 0000000..b793ff1
--- /dev/null
+++ b/website/front/_includes/JB/analytics-providers/gauges
@@ -0,0 +1,13 @@
+<script type="text/javascript">
+  var _gauges = _gauges || [];
+  (function() {
+    var t   = document.createElement('script');
+    t.type  = 'text/javascript';
+    t.async = true;
+    t.id    = 'gauges-tracker';
+    t.setAttribute('data-site-id', '{{ site.JB.analytics.gauges.site_id }}');
+    t.src = '//secure.gaug.es/track.js';
+    var s = document.getElementsByTagName('script')[0];
+    s.parentNode.insertBefore(t, s);
+  })();
+</script>

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/_includes/JB/analytics-providers/getclicky
----------------------------------------------------------------------
diff --git a/website/front/_includes/JB/analytics-providers/getclicky b/website/front/_includes/JB/analytics-providers/getclicky
new file mode 100755
index 0000000..e9462f4
--- /dev/null
+++ b/website/front/_includes/JB/analytics-providers/getclicky
@@ -0,0 +1,12 @@
+<script type="text/javascript">
+var clicky_site_ids = clicky_site_ids || [];
+clicky_site_ids.push({{ site.JB.analytics.getclicky.site_id }});
+(function() {
+  var s = document.createElement('script');
+  s.type = 'text/javascript';
+  s.async = true;
+  s.src = '//static.getclicky.com/js';
+  ( document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0] ).appendChild( s );
+})();
+</script>
+<noscript><p><img alt="Clicky" width="1" height="1" src="//in.getclicky.com/{{ site.JB.analytics.getclicky.site_id }}ns.gif" /></p></noscript>

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/_includes/JB/analytics-providers/google
----------------------------------------------------------------------
diff --git a/website/front/_includes/JB/analytics-providers/google b/website/front/_includes/JB/analytics-providers/google
new file mode 100755
index 0000000..9014866
--- /dev/null
+++ b/website/front/_includes/JB/analytics-providers/google
@@ -0,0 +1,11 @@
+<script type="text/javascript">
+  var _gaq = _gaq || [];
+  _gaq.push(['_setAccount', '{{ site.JB.analytics.google.tracking_id }}']);
+  _gaq.push(['_trackPageview']);
+
+  (function() {
+    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+  })();
+</script>
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/_includes/JB/analytics-providers/google-universal
----------------------------------------------------------------------
diff --git a/website/front/_includes/JB/analytics-providers/google-universal b/website/front/_includes/JB/analytics-providers/google-universal
new file mode 100755
index 0000000..834f2ee
--- /dev/null
+++ b/website/front/_includes/JB/analytics-providers/google-universal
@@ -0,0 +1,9 @@
+<script>
+  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
+  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+  })(window,document,'script','//www.google-analytics.com/analytics.js','ga');
+
+  ga('create', '{{ site.JB.analytics.googleUA.tracking_id }}', {% if site.JB.analytics.googleUA.property_name %}'{{ site.JB.analytics.googleUA.property_name }}'{% else %}'auto'{% endif %});
+  ga('send', 'pageview');
+</script>
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/_includes/JB/analytics-providers/mixpanel
----------------------------------------------------------------------
diff --git a/website/front/_includes/JB/analytics-providers/mixpanel b/website/front/_includes/JB/analytics-providers/mixpanel
new file mode 100755
index 0000000..4406eb0
--- /dev/null
+++ b/website/front/_includes/JB/analytics-providers/mixpanel
@@ -0,0 +1,11 @@
+<script type="text/javascript">
+    var mpq = [];
+    mpq.push(["init", "{{ site.JB.analytics.mixpanel.token}}"]);
+    (function(){var b,a,e,d,c;b=document.createElement("script");b.type="text/javascript";
+    b.async=true;b.src=(document.location.protocol==="https:"?"https:":"http:")+
+    "//api.mixpanel.com/site_media/js/api/mixpanel.js";a=document.getElementsByTagName("script")[0];
+    a.parentNode.insertBefore(b,a);e=function(f){return function(){mpq.push(
+    [f].concat(Array.prototype.slice.call(arguments,0)))}};d=["init","track","track_links",
+    "track_forms","register","register_once","identify","name_tag","set_config"];for(c=0;c<
+    d.length;c++){mpq[d[c]]=e(d[c])}})();
+</script>
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/_includes/JB/analytics-providers/piwik
----------------------------------------------------------------------
diff --git a/website/front/_includes/JB/analytics-providers/piwik b/website/front/_includes/JB/analytics-providers/piwik
new file mode 100755
index 0000000..077a373
--- /dev/null
+++ b/website/front/_includes/JB/analytics-providers/piwik
@@ -0,0 +1,10 @@
+<script type="text/javascript">
+  var pkBaseURL = (("https:" == document.location.protocol) ? "https://{{ site.JB.analytics.piwik.baseURL }}/" : "http://{{ site.JB.analytics.piwik.baseURL }}/");
+  document.write(unescape("%3Cscript src='" + pkBaseURL + "piwik.js' type='text/javascript'%3E%3C/script%3E"));
+</script><script type="text/javascript">
+  try {
+    var piwikTracker = Piwik.getTracker(pkBaseURL + "piwik.php", {{ site.JB.analytics.piwik.idsite }});
+    piwikTracker.trackPageView();
+    piwikTracker.enableLinkTracking();
+  } catch( err ) {}
+</script><noscript><p><img src="http://{{ site.JB.analytics.piwik.baseURL }}/piwik.php?idsite={{ site.JB.analytics.piwik.idsite }}" style="border:0" alt="" /></p></noscript>
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/_includes/JB/categories_list
----------------------------------------------------------------------
diff --git a/website/front/_includes/JB/categories_list b/website/front/_includes/JB/categories_list
new file mode 100755
index 0000000..83be2e2
--- /dev/null
+++ b/website/front/_includes/JB/categories_list
@@ -0,0 +1,37 @@
+{% comment %}<!--
+The categories_list include is a listing helper for categories.
+Usage:
+  1) assign the 'categories_list' variable to a valid hash or array of categories.
+  2) include JB/categories_list
+  example:
+    <ul>
+  	  {% assign categories_list = site.categories %}  
+  	  {% include JB/categories_list %}
+  	</ul>
+  
+  Notes: 
+    Categories can be either a Hash of Category objects (hashes) or an Array of category-names (strings).
+    The encapsulating 'if' statement checks whether categories_list is a Hash or Array.
+    site.categories is a Hash while page.categories is an array.
+    
+  This helper can be seen in use at: ../_layouts/default.html
+-->{% endcomment %}
+
+{% if site.JB.categories_list.provider == "custom" %}
+  {% include custom/categories_list %}
+{% else %}
+  {% if categories_list.first[0] == null %}
+    {% for category in categories_list %} 
+    	<li><a href="{{ BASE_PATH }}{{ site.JB.categories_path }}#{{ category }}-ref">
+    		{{ category | join: "/" }} <span>{{ site.categories[category].size }}</span>
+    	</a></li>
+    {% endfor %}
+  {% else %}
+    {% for category in categories_list %} 
+    	<li><a href="{{ BASE_PATH }}{{ site.JB.categories_path }}#{{ category[0] }}-ref">
+    		{{ category[0] | join: "/" }} <span>{{ category[1].size }}</span>
+    	</a></li>
+    {% endfor %}
+  {% endif %}
+{% endif %}
+{% assign categories_list = nil %}
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/_includes/JB/comments
----------------------------------------------------------------------
diff --git a/website/front/_includes/JB/comments b/website/front/_includes/JB/comments
new file mode 100755
index 0000000..eec2e1e
--- /dev/null
+++ b/website/front/_includes/JB/comments
@@ -0,0 +1,18 @@
+{% if site.JB.comments.provider and page.comments != false %}
+
+{% case site.JB.comments.provider %}
+{% when "disqus" %}
+  {% include JB/comments-providers/disqus %}
+{% when "livefyre" %}
+  {% include JB/comments-providers/livefyre %}
+{% when "intensedebate" %}
+  {% include JB/comments-providers/intensedebate %}
+{% when "facebook" %}
+  {% include JB/comments-providers/facebook %}
+{% when "duoshuo" %}
+  {% include JB/comments-providers/duoshuo %}
+{% when "custom" %}
+  {% include custom/comments %}
+{% endcase %}
+
+{% endif %}
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/_includes/JB/comments-providers/disqus
----------------------------------------------------------------------
diff --git a/website/front/_includes/JB/comments-providers/disqus b/website/front/_includes/JB/comments-providers/disqus
new file mode 100755
index 0000000..6343100
--- /dev/null
+++ b/website/front/_includes/JB/comments-providers/disqus
@@ -0,0 +1,15 @@
+<div id="disqus_thread"></div>
+<script type="text/javascript">
+    {% include JB/is_production %}
+    {% if is_production == false %}var disqus_developer = 1;{% endif %}
+    var disqus_shortname = '{{ site.JB.comments.disqus.short_name }}'; // required: replace example with your forum shortname
+    {% if page.wordpress_id %}var disqus_identifier = '{{page.wordpress_id}} {{site.production_url}}/?p={{page.wordpress_id}}';{% endif %}
+    /* * * DON'T EDIT BELOW THIS LINE * * */
+    (function() {
+        var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
+        dsq.src = 'http://' + disqus_shortname + '.disqus.com/embed.js';
+        (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
+    })();
+</script>
+<noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript>
+<a href="http://disqus.com" class="dsq-brlink">blog comments powered by <span class="logo-disqus">Disqus</span></a>

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/_includes/JB/comments-providers/duoshuo
----------------------------------------------------------------------
diff --git a/website/front/_includes/JB/comments-providers/duoshuo b/website/front/_includes/JB/comments-providers/duoshuo
new file mode 100755
index 0000000..90865a0
--- /dev/null
+++ b/website/front/_includes/JB/comments-providers/duoshuo
@@ -0,0 +1,14 @@
+<!-- Duoshuo Comment BEGIN -->
+  <div class="ds-thread"{% if page.wordpress_id %} data-thread-key="{{page.wordpress_id}}"{% endif %}></div>
+<script type="text/javascript">
+var duoshuoQuery = {short_name:'{{ site.JB.comments.duoshuo.short_name }}'};
+  (function() {
+    var ds = document.createElement('script');
+    ds.type = 'text/javascript';ds.async = true;
+    ds.src = 'http://static.duoshuo.com/embed.js';
+    ds.charset = 'UTF-8';
+    (document.getElementsByTagName('head')[0] 
+    || document.getElementsByTagName('body')[0]).appendChild(ds);
+  })();
+  </script>
+<!-- Duoshuo Comment END -->

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/_includes/JB/comments-providers/facebook
----------------------------------------------------------------------
diff --git a/website/front/_includes/JB/comments-providers/facebook b/website/front/_includes/JB/comments-providers/facebook
new file mode 100755
index 0000000..e1d3deb
--- /dev/null
+++ b/website/front/_includes/JB/comments-providers/facebook
@@ -0,0 +1,9 @@
+<div id="fb-root"></div>
+<script>(function(d, s, id) {
+  var js, fjs = d.getElementsByTagName(s)[0];
+  if (d.getElementById(id)) return;
+  js = d.createElement(s); js.id = id;
+  js.src = "//connect.facebook.net/en_US/all.js#xfbml=1&appId={{ site.JB.comments.facebook.appid }}";
+  fjs.parentNode.insertBefore(js, fjs);
+}(document, 'script', 'facebook-jssdk'));</script>
+<div class="fb-comments" data-href="{{ site.production_url }}{{ page.url }}" data-num-posts="{{ site.JB.comments.facebook.num_posts }}" data-width="{{ site.JB.comments.facebook.width }}" data-colorscheme="{{ site.JB.comments.facebook.colorscheme }}"></div>
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/_includes/JB/comments-providers/intensedebate
----------------------------------------------------------------------
diff --git a/website/front/_includes/JB/comments-providers/intensedebate b/website/front/_includes/JB/comments-providers/intensedebate
new file mode 100755
index 0000000..233ce34
--- /dev/null
+++ b/website/front/_includes/JB/comments-providers/intensedebate
@@ -0,0 +1,6 @@
+<script>
+var idcomments_acct = '{{ site.JB.comments.intensedebate.account }}';
+var idcomments_post_id;
+var idcomments_post_url;
+</script>
+<script type="text/javascript" src="http://www.intensedebate.com/js/genericCommentWrapperV2.js"></script>

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/_includes/JB/comments-providers/livefyre
----------------------------------------------------------------------
diff --git a/website/front/_includes/JB/comments-providers/livefyre b/website/front/_includes/JB/comments-providers/livefyre
new file mode 100755
index 0000000..704b803
--- /dev/null
+++ b/website/front/_includes/JB/comments-providers/livefyre
@@ -0,0 +1,6 @@
+<script type='text/javascript' src='http://zor.livefyre.com/wjs/v1.0/javascripts/livefyre_init.js'></script>
+<script type='text/javascript'>
+    var fyre = LF({
+        site_id: {{ site.JB.comments.livefyre.site_id }}
+    });
+</script>
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/_includes/JB/feedburner
----------------------------------------------------------------------
diff --git a/website/front/_includes/JB/feedburner b/website/front/_includes/JB/feedburner
new file mode 100755
index 0000000..6dba603
--- /dev/null
+++ b/website/front/_includes/JB/feedburner
@@ -0,0 +1,3 @@
+{% if site.author.feedburner != null %}
+<link href="http://feeds.feedburner.com/{{ site.author.feedburner }}" rel="alternate" title="{{ site.title }}" type="application/atom+xml" />
+{% endif %}

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/_includes/JB/file_exists
----------------------------------------------------------------------
diff --git a/website/front/_includes/JB/file_exists b/website/front/_includes/JB/file_exists
new file mode 100755
index 0000000..f40080f
--- /dev/null
+++ b/website/front/_includes/JB/file_exists
@@ -0,0 +1,26 @@
+{% comment %}<!--
+  param:  file = "/example/file.png"
+  return: file_exists_result = true
+  
+  examples:
+    {% include JB/file_exists file="/404.html" %}
+    {% if file_exists_result %}Found "/404.html"!{% else %}Did not find "/404.html".{% endif %}
+
+    {% assign filename = "/405.html" %}
+    {% include JB/file_exists file=filename %}
+    {% if file_exists_result %}Found "{{ filename }}"!{% else %}Did not find "{{ filename }}".{% endif %}
+
+  NOTE: the BREAK statement in the FOR loop assumes Liquid >= 2.5.0
+  
+-->{% endcomment %}
+
+{% assign file_exists_result = false %}
+
+{% if include.file %}
+	{% for static_file in site.static_files %}
+		{% if static_file.path == include.file %}
+			{% assign file_exists_result = true %}
+			{% break %}
+		{% endif %}
+	{% endfor %}
+{% endif %}

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/_includes/JB/gist
----------------------------------------------------------------------
diff --git a/website/front/_includes/JB/gist b/website/front/_includes/JB/gist
new file mode 100755
index 0000000..38a5b1c
--- /dev/null
+++ b/website/front/_includes/JB/gist
@@ -0,0 +1,19 @@
+{% comment %}<!--
+The gist include allows you to embed GitHub Gist snippets in your content.
+Usage:
+  1) include JB/gist
+  2) specify the gist_id parameter (REQUIRED)
+  3) specify the gist_file parameter (OPTIONAL)
+  example:
+    <ul>
+  	  {% include JB/gist gist_id="fdcfeaba4f33c172828d" %}
+  	  {% include JB/gist gist_id="fdcfeaba4f33c172828d" gist_file="jekyll-bootstrap.js" %}
+  	</ul>
+-->{% endcomment %}
+
+<div id="gist">
+<script src="https://gist.github.com/{{ include.gist_id }}.js{% if include.gist_file %}?file={{ include.gist_file }}{% endif %}"></script>
+<noscript>
+<pre>https://gist.github.com/{{include.gist_id}}.js{% if include.gist_file %}?file={{include.gist_file}}{% endif %}</pre>
+</noscript>
+</div>

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/_includes/JB/is_production
----------------------------------------------------------------------
diff --git a/website/front/_includes/JB/is_production b/website/front/_includes/JB/is_production
new file mode 100755
index 0000000..3548f8c
--- /dev/null
+++ b/website/front/_includes/JB/is_production
@@ -0,0 +1,39 @@
+{% capture jbcache %}{% comment %}
+
+  Determine whether or not the site is being built in a production environment.
+  
+  Parameters:
+    None.
+
+  Returns:
+    is_production: [true|false]
+    jb_prod_env: [development|github|other]
+
+  Examples:
+  
+    {% include JB/is_production %}
+    
+    {% if is_production != true %}
+      <h3>This is Private</h3>
+      <p>I love to watch television in my undies. Don't tell anyone!</p>
+    {% endif %}
+    
+    <h3>This is Public</h3>
+    <p>I have no unusual quirks.</p>
+
+{% endcomment %}
+
+{% assign is_production = false %}
+{% assign jb_prod_env = "development" %}
+
+{% if jekyll.environment != "development" %}
+  {% assign is_production = true %}
+  {% assign jb_prod_env = jekyll.environment %}
+{% endif %}
+
+{% if site.github %}
+  {% assign is_production = true %}
+  {% assign jb_prod_env = "github" %}
+{% endif %}
+
+{% endcapture %}{% assign jbcache = nil %}
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/mahout/blob/e2549b78/website/front/_includes/JB/liquid_raw
----------------------------------------------------------------------
diff --git a/website/front/_includes/JB/liquid_raw b/website/front/_includes/JB/liquid_raw
new file mode 100755
index 0000000..da2d359
--- /dev/null
+++ b/website/front/_includes/JB/liquid_raw
@@ -0,0 +1,32 @@
+{% comment %}<!--
+The liquid_raw helper is a way to display raw Liquid code, as opposed to parsing it.
+Normally you'd use Liquid's built-in 'raw' tag.
+The problem is that GitHub Jekyll does not support the current Liquid release.
+GitHub Jekyll supports the deprecated 'literal' tag.
+Using one will break the other if you plan to deploy to GitHub Pages.
+  see: https://github.com/mojombo/jekyll/issues/425
+
+Since I don't want to mess with Liquid versions, I'll just change the way I
+write Liquid examples. It's not an elegant solution by any means:
+
+Usage: 
+  1) Define a 'text' variable with the block of liquid code you intend to display.
+  2) Pass the text variable to include JB/liquid_raw
+
+  example:
+  {% capture text %}|.% for tag in tags_list %.|
+    <li><a href="|.{ site.var.tags_path }.||.{ tag[0] }.|-ref">|.{ tag[0] }.| <span>|.{tag[1].size}.|</span></a></li>
+  |.% endfor %.|
+
+  |.% assign tags_list = null %.|{% endcapture %}    
+  {% include JB/liquid_raw %}
+  
+  As seen here, you must use "|." and ".|" as opening and closing brackets.
+-->{% endcomment%}
+
+{% if site.JB.liquid_raw.provider == "custom" %}
+  {% include custom/liquid_raw %}
+{% else %}
+  <pre><code>{{text | replace:"|.", "&#123;" | replace:".|", "&#125;" | replace:">", "&gt;" | replace:"<", "&lt;" }}</code></pre>
+{% endif %}
+{% assign text = nil %}
\ No newline at end of file

