mahout-dev mailing list archives

From Sebastian Schelter <...@apache.org>
Subject Re: 0.8
Date Thu, 25 Jul 2013 22:37:37 GMT
It means we aim to remove the following things *from Mahout Math*:

- Lanczos (use SSVD instead)
- Hadoop entropy stuff in org.apache.mahout.math.stats.entropy


2013/7/25 Dmitriy Lyubimov <dlieu.7@gmail.com>

> On Thu, Jul 25, 2013 at 6:44 AM, Suneel Marthi <suneel_marthi@yahoo.com> wrote:
>
> > With Isabel's help, updated the 0.8 Release notes on the Wiki and below is
> > the text version of the Release notes.
> >
> > Check out the Wiki version at
> >
> > https://cwiki.apache.org/confluence/display/MAHOUT/Release+0.8
> >
> > -------------------------------------------
> >
> > The Apache Mahout PMC is pleased to announce the release of Mahout 0.8.
> > Mahout's goal is to build scalable machine learning libraries focused
> > primarily in the areas of collaborative filtering (recommenders),
> > clustering and classification (known as the "3Cs"), as well as the
> > necessary infrastructure to support those implementations including, but
> >  not limited to, math packages for statistics, linear algebra and others
> >  as well as Java primitive collections, local and distributed vector and
> >  matrix classes and a variety of integrative code to work with popular
> > packages like Apache Hadoop, Apache Lucene, Apache HBase, Apache
> > Cassandra and much more. The 0.8 release is mainly a clean up release in
> >  preparation for an upcoming 1.0 release, but there are several
> > significant new features, which are highlighted below.
> >
> > To get started with Apache Mahout 0.8, download the release artifacts and
> > signatures at http://www.apache.org/dyn/closer.cgi/mahout. The examples
> > directory contains several working examples of the core
> > functionality available in Mahout. These can be run via scripts in the
> > examples/bin directory. Most examples do not need a Hadoop cluster in
> > order to run.
> >
> > Please pay attention to the section labelled FUTURE PLANS below for more
> > information about upcoming releases of Mahout.
> >
> > As with any release, we wish to thank all of the users and contributors
> > to Mahout. Please see the CHANGELOG [1] and JIRA Release Notes [2] for
> > individual credits, as there are too many to list here.
> >
> > RELEASE HIGHLIGHTS
> >
> > The highlights of the Apache Mahout 0.8 release include, but are not
> > limited to, the list below. For further information, see the included
> > CHANGELOG file.
> >
> > - Numerous performance improvements to Vector and Matrix
> > implementations, APIs and their iterators (see also MAHOUT-1192,
> > MAHOUT-1202)
> > - Numerous performance improvements to the recommender implementations
> > (see also MAHOUT-1272, MAHOUT-1035, MAHOUT-1042, MAHOUT-1151,
> > MAHOUT-1166, MAHOUT-1167, MAHOUT-1169, MAHOUT-1205, MAHOUT-1264)
> > - MAHOUT-1088: Support for biased item-based recommender
> > - MAHOUT-1089: SGD matrix factorization for rating prediction with user
> > and item biases
> > - MAHOUT-1106: Support for SVD++
> > - MAHOUT-944: Support for converting one or more Lucene storage indexes
> > to SequenceFiles as well as an upgrade of the supported Lucene version
> > to Lucene 4.3.1.
> > - MAHOUT-1154 and friends: New streaming k-means implementation that
> > offers on-line (and fast) clustering
> > - MAHOUT-833: Make conversion to SequenceFiles Map-Reduce; 'seqdirectory'
> > can now be run as a MapReduce job.
> > - MAHOUT-1052: Add an option to MinHashDriver that specifies the dimension
> > of vector to hash (indexes or values).
> > - MAHOUT-884: Matrix Concat utility, presently only concatenates two
> > matrices.
> > - MAHOUT-1244: Upgraded to use Lucene 4.3
> > - MAHOUT-1187: Upgraded to CommonsLang3
> > - MAHOUT-916: Speed up the Mahout build by making tests run in parallel.
> > - The usual bug fixes. See JIRA [2] for more
> > information on the 0.8 release.
> >
> > A total of 218 separate JIRA issues are addressed in this release.
> >
> > CONTRIBUTING
> >
> > Mahout is always looking for contributions focused on the 3Cs. If you are
> > interested in contributing, please see our how-to-contribute page at
> > https://cwiki.apache.org/MAHOUT/how-to-contribute.html on the Mahout wiki
> > or contact us via email at dev@mahout.apache.org.
> >
> > FUTURE PLANS
> >
> > 0.9
> >
> > As the project moves towards a 1.0 release, the community is working to
> > clean up and/or remove parts of the code base that are under-supported
> > or that underperform as well as to better focus the energy and
> > contributions on key algorithms that are proven to scale in production
> > and have seen widespread adoption. To this end, in the next release,
> > the project is planning on removing support for the following algorithms
> >  unless there is sustained support and improvement of them before the
> > next release.
> >
> > The algorithms to be removed are:
> > - From Clustering:
> > Dirichlet
> > MeanShift
> > MinHash
> > Eigencuts
> > - From Classification (both are sequential implementations)
> > Winnow
> > Perceptron
> > - Frequent Pattern Mining
> > - Collaborative Filtering
> > All recommenders in org.apache.mahout.cf.taste.impl.recommender.knn
> > SlopeOne implementations in org.apache.mahout.cf.taste.hadoop.slopeone and
> > org.apache.mahout.cf.taste.impl.recommender.slopeone
> > Distributed pseudo recommender in org.apache.mahout.cf.taste.hadoop.pseudo
> > TreeClusteringRecommender in org.apache.mahout.cf.taste.impl.recommender
> > - Mahout Math
> >
>
> What does it mean -- remove "Mahout Math"?
>
>
> > Lanczos in favour of SSVD
> > Hadoop entropy stuff in org.apache.mahout.math.stats.entropy
> >
> > If you are interested in supporting one or more of these algorithms, please
> > make it known on dev@mahout.apache.org and via JIRA issues that fix
> > and/or improve them. Please also provide
> > supporting evidence as to their effectiveness for you in production.
> >
> > 1.0 PLANS
> >
> > Our plans as a community are to focus 0.9 on cleanup of bugs and the
> > removal of the code mentioned above and then to follow with a 1.0
> > release soon thereafter, at which point the community is committing to
> > the support of the algorithms packaged in the 1.0 for at least two minor
> >  versions after their release. In the case of removal, we will deprecate
> >  the functionality in the 1.(x+1) minor release and remove it in the
> > 1.(x+2) release. For instance, if feature X is to be removed after the
> > 1.2 release, it will be deprecated in 1.3 and removed in 1.4.
>
>
> > [1] http://svn.apache.org/viewvc/mahout/trunk/CHANGELOG?revision=1501110&view=markup
> > [2] https://issues.apache.org/jira/issues/?jql=project%20%3D%20MAHOUT%20AND%20fixVersion%20%3D%20%220.8%22
> >
> >
> >
> >
> > ________________________________
> >  From: Grant Ingersoll <gsingers@apache.org>
> > To: "dev@mahout.apache.org" <dev@mahout.apache.org>
> > Sent: Wednesday, July 24, 2013 7:51 AM
> > Subject: 0.8
> >
> >
> > 0.8 artifacts are pushed to the mirror location.  I will send an official
> > announcement tomorrow.
> >
> > In the meantime, please review the release notes at:
> > https://cwiki.apache.org/confluence/display/MAHOUT/Release+0.8
> >
> > The new features/fixes section is pretty weak.
> >
> > -Grant
> >
>
