mahout-commits mailing list archives

Subject [CONF] Apache Mahout > Release 0.8
Date Sat, 08 Jun 2013 10:49:00 GMT
Space: Apache Mahout
Page: Release 0.8

Added by Grant Ingersoll:
The Apache Mahout PMC is pleased to announce the release of Mahout 0.8.  Mahout's goal is
to build scalable machine learning libraries focused primarily in the areas of collaborative
filtering (recommenders), clustering and classification (known as the "3Cs"), as well as the
necessary infrastructure to support those implementations including, but not limited to, math
packages for statistics, linear algebra and others as well as Java primitive collections,
local and distributed vector and matrix classes and a variety of integrative code to work
with popular packages like Apache Lucene, Apache HBase, Apache Cassandra and much more.

To get started with Apache Mahout 0.8, download the release artifacts and signatures at FILL
IN HERE.  The examples directory contains several working examples of the core functionality
available in Mahout.  These can be run via scripts in the examples/bin directory.  Most examples
do not need a Hadoop cluster in order to run.

Please pay attention to the section labelled FUTURE PLANS below for more information about
upcoming releases of Mahout.

As with any release, we wish to thank all of the users and contributors to Mahout.  Please
see the CHANGELOG and JIRA for individual credits, as there are too many to list here.


Mahout is always looking for contributions focused on the 3Cs.  If you are interested in contributing,
please see the Mahout wiki.


The highlights of the Apache Mahout 0.8 release include, but are not limited to, the list below.
For further information, see the included CHANGELOG file.

- Numerous performance improvements to Vector and Matrix implementations and their iterators
- MAHOUT-944:  Support for converting one or more Lucene storage indexes to SequenceFiles.
- The usual bug fixes.  See JIRA for more information on the 0.8 release.



FUTURE PLANS

As the project moves towards a 1.0 release, the community is working to clean up and/or remove
parts of the code base that are under-supported or that underperform, and to better
focus energy and contributions on key algorithms that are proven to scale in production
and have seen widespread adoption.  To this end, the project plans to remove support for the
following algorithms in the next release unless they receive sustained support and improvement
before then.

The algorithms to be removed are:
- From Clustering:
- From Classification (both are sequential implementations)
- Frequent Pattern Mining
- Collaborative Filtering

If you are interested in supporting one or more of these algorithms, please make it known via JIRA issues that fix and/or improve them.  Please also provide
supporting evidence as to their effectiveness for you in production.


Our plans as a community are to focus 0.9 on cleanup of bugs and the removal of the code mentioned
above and then to follow with a 1.0 release soon thereafter, at which point the community
is committing to the support of the algorithms packaged in the 1.0 for at least two minor
versions after their release.  In the case of removal, we will deprecate the functionality
in the 1.(x+1) minor release and remove it in the 1.(x+2) release.  For instance, if feature
X is to be removed after the 1.2 release, it will be deprecated in 1.3 and removed in 1.4.
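
The deprecation schedule described above can be sketched as a small calculation.  This is
only an illustration of the stated policy, not code that ships with Mahout: the class name
DeprecationSchedule and its methods are hypothetical, and the sketch assumes version numbers
of the plain form major.minor.

```java
// Hypothetical helper, not part of Mahout: models the 1.x deprecation policy
// stated in the announcement (deprecate in 1.(x+1), remove in 1.(x+2)).
final class DeprecationSchedule {
    private DeprecationSchedule() {}

    /** Release in which the feature is marked deprecated: one minor version after x. */
    static String deprecatedIn(int major, int lastSupportedMinor) {
        return major + "." + (lastSupportedMinor + 1);
    }

    /** Release in which the feature is removed: two minor versions after x. */
    static String removedIn(int major, int lastSupportedMinor) {
        return major + "." + (lastSupportedMinor + 2);
    }

    public static void main(String[] args) {
        // The example from the announcement: removal decided after the 1.2 release.
        System.out.println("deprecated in " + deprecatedIn(1, 2)); // deprecated in 1.3
        System.out.println("removed in " + removedIn(1, 2));       // removed in 1.4
    }
}
```

Under this reading, a feature slated for removal always spends exactly one minor release in a
deprecated state before it is removed.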
