lucene-general mailing list archives

From Grant Ingersoll
Subject [ANN] Apache Mahout 0.2 Released
Date Wed, 18 Nov 2009 13:34:48 GMT
Apache Mahout 0.2 has been released and is now available for public
download at

Apache Mahout is a subproject of Apache Lucene with the goal
of delivering scalable machine learning algorithm implementations
under the Apache license.
Here, scalable means scaling in terms of computation to the
size of data you manage today, in terms of community to support
anyone interested in using machine learning, and in terms of
business by providing the library under a commercially friendly,
free software license.

Built on top of the powerful map/reduce paradigm of the Apache Hadoop
project, Mahout aims to solve popular machine learning problems
like clustering, collaborative filtering, and classification
on extremely large data sets across thousands of computers.

Up-to-date Maven artifacts can be found in the Apache repository at

The complete changelist can be found here:

New Mahout 0.2 features include

- Major performance enhancements in Collaborative Filtering,
Classification and Clustering
- New: Latent Dirichlet Allocation (LDA) implementation for topic
modelling
- New: Frequent Itemset Mining for mining top-k patterns from a list
of transactions
- New: Decision Forests implementation for Decision Tree classification
(In Memory & Partial Data)
- New: HBase storage support for Naive Bayes model building and
classification
- New: Generation of vectors from text documents for use with Mahout
- Performance improvements in various Vector implementations
- Tons of bug fixes and code cleanup

Getting started: New to Mahout? 

1) Download Mahout at
2) Check out the Quick start: 

3) Read the Mahout Wiki:
4) Join the community by subscribing to
5) Give back: (optional, but much appreciated!)
6) Consider adding yourself to the Powered By wiki page:

For more information on Apache Mahout, see
