hama-dev mailing list archives

From "Yexi Jiang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HAMA-681) Multi Layer Perceptron
Date Mon, 27 May 2013 17:25:21 GMT

    [ https://issues.apache.org/jira/browse/HAMA-681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13667871#comment-13667871 ]

Yexi Jiang edited comment on HAMA-681 at 5/27/13 5:25 PM:
----------------------------------------------------------

Hi all,

I have finished the first version (including the JUnit test cases) of the Multi-Layer Perceptron
(MLP); the code is available at https://github.com/yxjiang/hama/tree/trunk/ml/src/main/java/org/apache/hama/ml/perceptron.
I made a mirror of the original Hama repository and created a new Java package, org.apache.hama.ml.perceptron.

The first version has the following characteristics:
1. No extra dependency on libraries/code other than the ones already in HAMA.
2. A unified interface for the small scale and the large scale MLP.
The backend computational mechanism is transparent to the user, who does not need to
care about the difference between the small scale and the large scale MLP.
(A small scale MLP is one whose model can be held in the main memory of a single machine;
a large scale MLP is one whose model cannot.)
3. An implementation of the small scale MLP. It allows the user to set parameters such as the MLP
topology, the learning rate, etc. By setting the batch size parameter to different values, the
algorithm can learn by stochastic gradient descent, mini-batch gradient descent, or traditional
(full batch) gradient descent; see the sketch below.
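
As an illustration only, here is a minimal sketch of how the batch size could select among the
three behaviours. The mapping (batch size 1 for stochastic, an intermediate value for mini-batch,
the full data size for traditional gradient descent) is my reading of the description above, the
training.batch.size key is taken from the example further below, and totalNumberOfRecords is a
hypothetical variable; treat all of these as assumptions rather than a documented contract.
-----------------------------------------------------
  // Hypothetical illustration of the batch-size convention described above.
  Map<String, String> params = new HashMap<String, String>();
  params.put("training.batch.size", "1");    // one record per update: stochastic gradient descent
  // params.put("training.batch.size", "200");  // a few hundred records per update: mini-batch gradient descent
  // params.put("training.batch.size", String.valueOf(totalNumberOfRecords));  // all records per update: traditional gradient descent
-----------------------------------------------------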



TO DO LIST:
1. Improve the small scale MLP to support more parameters, including momentum and learning
rate decay (a sketch of the intended update rule follows this list).
2. Fix possible bugs.
3. Improve the performance of the small scale MLP.
4. Design the large scale MLP.
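
For context only, the following sketch shows the standard way momentum and learning rate decay
enter a gradient descent weight update; it is not Hama API code, the variable names are mine, and
the 1/(1 + decay * iteration) schedule is just one common choice.
-----------------------------------------------------
  // Illustrative momentum + learning rate decay update for a single weight (not Hama API).
  double learningRate = 0.5;
  double decay = 0.01;        // learning rate decay factor per iteration
  double momentum = 0.9;      // fraction of the previous update carried over
  double previousDelta = 0.0; // last update applied to this weight
  double weight = 0.1;        // one weight of the network
  double gradient = 0.05;     // partial derivative of the cost w.r.t. this weight

  for (int iteration = 0; iteration < 10; ++iteration) {
    double currentRate = learningRate / (1 + decay * iteration);
    double delta = -currentRate * gradient + momentum * previousDelta;
    weight += delta;
    previousDelta = delta;
    // ... in a real trainer, 'gradient' would be recomputed from the training data here
  }
-----------------------------------------------------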



For the first version, the user can create and use an MLP with the following example code.
-----------------------------------------------------
  import java.util.HashMap;
  import java.util.Map;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.Path;

  import org.apache.hama.ml.math.DoubleVector;  // package assumed from the existing ml module
  import org.apache.hama.ml.perceptron.MultiLayerPerceptron;
  import org.apache.hama.ml.perceptron.SmallMultiLayerPerceptron;

  /*  Assume the following path contains the training data.
   *  The training data is in the format of VectorWritable,
   *  where the first m elements contain the training features
   *  and the remaining elements are the expected values of the output layer.
   *  (A sketch of preparing such data follows this example.)
   */
  Configuration conf = new Configuration();
  String strDataPath = "hdfs://localhost:9000/tmp/dummy";
  Path dataPath = new Path(strDataPath);

  //  Specify the model parameters
  String modelPath = "sampleModel.data";
  double learningRate = 0.5;
  boolean regularization = false;  //  no regularization
  double momentum = 0;             //  no momentum
  String squashingFunctionName = "Sigmoid";
  String costFunctionName = "SquareError";
  int[] layerSizeArray = new int[]{3, 2, 2, 3};
  MultiLayerPerceptron mlp = new SmallMultiLayerPerceptron(learningRate, regularization,
      momentum, squashingFunctionName, costFunctionName, layerSizeArray);

  //  Specify the training parameters
  Map<String, String> trainingParams = new HashMap<String, String>();
  trainingParams.put("training.iteration", "10");
  trainingParams.put("training.mode", "minibatch.gradient.descent");
  trainingParams.put("training.batch.size", "200");
  trainingParams.put("tasks", "3");

  //  Train the model
  try {
    mlp.train(dataPath, trainingParams);
  } catch (Exception e) {
    e.printStackTrace();
  }

  /*
   *  Use the model via its 'output' method.
   *  The input and the result are both DoubleVector instances.
   */
  try {
    DoubleVector testVec = null;  /* ... code to get the test vector */
    DoubleVector resultVec = mlp.output(testVec);
  } catch (Exception e) {
    e.printStackTrace();
  }
-----------------------------------------------------
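
The example above assumes the training data already exists at dataPath as VectorWritable records.
Below is a rough sketch of how such a file might be prepared; the use of a SequenceFile with
LongWritable keys, and the VectorWritable/DenseDoubleVector classes of Hama's ml module, are my
assumptions here, not something specified by this patch (error handling is omitted).
-----------------------------------------------------
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.SequenceFile;
  import org.apache.hama.ml.math.DenseDoubleVector;   // package assumed
  import org.apache.hama.ml.writable.VectorWritable;  // package assumed

  // Hedged sketch: write one dummy training record as a VectorWritable to HDFS.
  Configuration conf = new Configuration();
  Path dataPath = new Path("hdfs://localhost:9000/tmp/dummy");
  FileSystem fs = FileSystem.get(conf);

  SequenceFile.Writer writer = SequenceFile.createWriter(fs, conf, dataPath,
      LongWritable.class, VectorWritable.class);
  try {
    // 3 input features followed by 3 expected output values (layer sizes {3, 2, 2, 3} above)
    double[] record = new double[]{0.1, 0.2, 0.3, 1.0, 0.0, 0.0};
    writer.append(new LongWritable(0), new VectorWritable(new DenseDoubleVector(record)));
  } finally {
    writer.close();
  }
-----------------------------------------------------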


                
> Multi Layer Perceptron 
> -----------------------
>
>                 Key: HAMA-681
>                 URL: https://issues.apache.org/jira/browse/HAMA-681
>             Project: Hama
>          Issue Type: New Feature
>          Components: machine learning
>            Reporter: Christian Herta
>
> Implementation of a Multilayer Perceptron (Neural Network)
>  - Learning by Backpropagation 
>  - Distributed Learning
> The implementation should be the basis for the long range goals:
>  - more efficient learning (Adagrad, L-BFGS)
>  - Highly efficient distributed learning
>  - Autoencoder - Sparse (denoising) Autoencoder
>  - Deep Learning
>  
> ---
> Due to the overhead of MapReduce (MR), MR did not seem to be the best strategy to distribute
> the learning of MLPs.
> Therefore the current implementation of the MLP (see MAHOUT-976) should be migrated to
> Hama. First, all dependencies on Mahout (its matrix library) must be removed to get a standalone
> MLP implementation. Then the Hama BSP programming model should be used to realize distributed
> learning.
> Different strategies for efficient synchronized weight updates have to be evaluated.
> Resources:
>  Videos:
>     - http://www.youtube.com/watch?v=ZmNOAtZIgIk
>     - http://techtalks.tv/talks/57639/
>  MLP and Deep Learning Tutorial:
>  - http://www.stanford.edu/class/cs294a/
>  Scientific Papers:
>  - Google's "Brain" project: 
> http://research.google.com/archive/large_deep_networks_nips2012.html
>  - Neural Networks and BSP: http://ipdps.cc.gatech.edu/1998/biosp3/bispp4.pdf
>  - http://jmlr.csail.mit.edu/papers/volume11/vincent10a/vincent10a.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
