Date: Sun, 31 May 2015 16:58:36 -0400
Subject: [ANNOUNCE] Apache Mahout 0.10.1 Released
From: Suneel Marthi
To: mahout, dev@bigtop.apache.org, dev@flink.apache.org, user@mahout.apache.org, user@bigtop.apache.org, user@flink.apache.org, announce@apache.org

The Apache Mahout PMC is pleased to announce the release of Mahout 0.10.1.
Mahout's goal is to create an environment for quickly creating machine learning applications that scale and run on the highest performance parallel computation engines available. Mahout comprises an interactive environment and library that supports generalized scalable linear algebra and includes many modern machine learning algorithms.

The Mahout Math environment is called “Samsara”, a symbol of universal renewal. It reflects a fundamental rethinking of how scalable machine learning algorithms are built and customized. Mahout-Samsara is here to help people create their own math while providing some off-the-shelf algorithm implementations. At its base are general linear algebra and statistical operations along with the data structures to support them. It’s written in Scala with Mahout-specific extensions, and runs most fully on Spark.

To get started with Apache Mahout 0.10.1, download the release artifacts and signatures from http://www.apache.org/dist/mahout/0.10.1/.

Many thanks to the contributors and committers who were part of this release. Please see below for the Release Highlights.

RELEASE HIGHLIGHTS

This is an incremental minor release over Mahout 0.10.0, meant to fix a few bugs and to support Spark 1.2.2 and earlier.

Mahout 0.10.1

1. This release fixes a major memory usage bug in the co-occurrence analysis used by the spark-itemsimilarity driver (MAHOUT-1707). The analysis now requires far less memory in the executor.
2. Support for Spark 1.2.2 and earlier. Due to a bug in the JavaSerializer in Spark 1.2+ (SPARK-6069), we removed the use of Guava from any code executed in Spark executors. To do this we created a Scala Collections based BiMap, so any example code showing how to use the old Guava collections is obsolete.
3. Some minor fixes to Mahout-Samsara QR decomposition and matrix ops.
4. Trimmed the package size to under 200MB (MAHOUT-1704).
5. Minor testing indicates binary compatibility with Spark 1.3, except for the Mahout Shell, which does not run.

STATS

A total of 9 separate JIRA issues are addressed in this release [2], with 5 bugfixes.

Scope of Mahout 0.10.2 (targeted for June 28, 2015)

1. In-core transpose view rewrites. Modifiable transpose views (for (col <- a.t) col := 5).
2. Matrix structure flavor additions (understand general matrix structure and stride direction).
3. %*% optimization based on matrix flavors.
4. In-core ::= sparse assignment functions.
5. Assign := optimization (do proper traversal based on matrix flavors, similarly to %*%).
6. Adding in-place elementwise functional assignment (e.g. mxA := exp _, mxA ::= exp _).
7. Distributed and in-core versions of simple elementwise analogues of scala.math._. For example, for log(x) the convention is dlog(drm), mlog(mx), vlog(vec). Unfortunately we cannot overload these functions over what is done in scala.math, i.e. Scala would not allow log(mx) or log(drm) and log(Double) at the same time, mainly because they are defined in different packages.
8. Distributed performance bug fixes. This relates mostly to (a) matrix multiplication deficiencies, and (b) handling parallelism.
9. Distributed allreduceBlock predicate.
10. Distributed optimizer operators for elementwise functions. Rewrites recognizing e.g. 1 + drmX * dexp(drmX) as a single fused elementwise physical operator.
11. More cbind, rbind flavors (e.g. 1 cbind mxX, 1 cbind drmX or the other way around).

Mahout 0.11.0-snapshot (ongoing, but available)

1. Support for Spark 1.3 sequence file write.
2. Spark Shell (timing TBD).
3. First release that would see integration of Apache Mahout with Apache Flink as a backend.

GETTING STARTED

Download the release artifacts and signatures at http://www.apache.org/dist/mahout/0.10.1/

The examples directory contains several working examples of the core functionality available in Mahout.
These can be run via scripts in the examples/bin directory. Most examples do not need a Hadoop cluster in order to run.

FUTURE PLANS

We will continue bug fixes and enhancements on the 0.10.x branch, which will remain dependent on Spark 1.2.x. Support for Spark 1.3 will be in the master branch, reflecting Mahout-0.11.0-SNAPSHOT. To see progress on this branch, look here: https://github.com/apache/mahout/commits/master. As of this writing it is not yet ready to build for Spark 1.3.

Integration with Apache Flink is in the works, in collaboration with TU Berlin and Data Artisans, to add Flink as the third execution engine to Mahout. This would be in addition to the existing Apache Spark and H2O engines.

CONTRIBUTING

If you are interested in contributing, please see our How to Contribute [3] page or contact us via email at dev@mahout.apache.org.

CREDITS

As with any release, we wish to thank all of the users and contributors to Mahout. Please see the CHANGELOG [1] and JIRA Release Notes [2] for individual credits, as there are too many to list here.

[1] https://github.com/apache/mahout/blob/mahout-0.10.x/CHANGELOG
[2] https://issues.apache.org/jira/browse/MAHOUT-1707?jql=project%20%3D%20MAHOUT%20AND%20status%20in%20%28Resolved%2C%20closed%29%20AND%20%28fixVersion%20%3D%200.10.1%29
[3] http://mahout.apache.org/developers/how-to-contribute.html