mahout-user mailing list archives

From Drew Farris <drew.far...@gmail.com>
Subject Re: download mahout-0.2 release
Date Thu, 10 Dec 2009 21:33:57 GMT
On Thu, Dec 10, 2009 at 7:26 AM, Isabel Drost <isabel@apache.org> wrote:


> As for the jars we depend on:
>
> I assume that not all of Mahout depends on all libraries. Say the
> clustering code certainly does not depend on HBase. Especially for those
> users who do not want to use maven for their project, it might be pretty
> interesting to know, which libraries are needed by the components they
> are specifically interested in.
>

The dependency reports from maven are pretty helpful to this end. Thanks for
setting these up. It is too bad that deep links into the repo can't be
generated as a part of this report as well.

I agree with not forcing users to use maven. I really like maven, but I know
plenty of people who don't, or who don't want to be bothered learning it.

To move ahead with a binary release, it is necessary to determine the
minimum set of dependencies we need to re-distribute with the release. The
number of dependencies Mahout has is pretty large, but many of them are
transitive. I suspect many of these are not needed, for example the jetty
and tomcat related jars pulled in by hadoop and some of the duplicates (2
versions of commons-cli, etc). As a start, see the report at:
http://people.apache.org/~isabel/mahout_site/mahout-core/dependencies.html
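
As a rough illustration of how the duplicates could be spotted mechanically,
here is a minimal Python sketch. It assumes the input is the plain text
output of `mvn dependency:list`, where each dependency appears as
groupId:artifactId:type:version:scope. The function name and the sample
coordinates are hypothetical, not taken from the actual Mahout report. It
groups on artifactId alone, on the assumption that the same library may show
up under more than one groupId.

```python
from collections import defaultdict


def find_duplicate_artifacts(lines):
    """Return artifactIds that appear with more than one (groupId, version).

    Assumes each dependency line looks like
    '[INFO]    groupId:artifactId:type:version:scope'
    (optionally with a classifier before the version).
    """
    seen = defaultdict(set)
    for raw in lines:
        # Drop the "[INFO]" prefix maven puts in front of each line.
        coord = raw.replace("[INFO]", "").strip()
        # Coordinates never contain spaces; skip banner and summary lines.
        if not coord or " " in coord:
            continue
        parts = coord.split(":")
        if len(parts) == 5:
            group, artifact, _type, version, _scope = parts
        elif len(parts) == 6:
            group, artifact, _type, _classifier, version, _scope = parts
        else:
            continue
        seen[artifact].add((group, version))
    # Keep only artifacts that resolved to more than one group/version pair.
    return {a: sorted(gv) for a, gv in seen.items() if len(gv) > 1}


# Hypothetical sample input, for illustration only.
sample = [
    "[INFO]    commons-cli:commons-cli:jar:1.2:compile",
    "[INFO]    org.example.repackaged:commons-cli:jar:2.0:compile",
    "[INFO]    org.example:some-lib:jar:1.0:compile",
    "[INFO] BUILD SUCCESS",
]

duplicates = find_duplicate_artifacts(sample)
for artifact, entries in duplicates.items():
    print(artifact, entries)
```

Anything this flags would still need a manual look to decide which copy, if
any, has to ship with a binary release.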

Grant, do you have a sense of which jars we can redistribute and which we
can't? I did notice javax.mail was in there, but are there others? For that
matter, how is javax.mail used anyway? It is present in the maven/pom.xml,
but doesn't seem to break the build if it is removed.

It is also worth discussing the goals of a binary release: whether it
should go beyond providing pre-built jars and a limited set of dependencies,
and also allow a number of examples to be run or include a driver script
similar to the one included in hadoop or nutch (as proposed in MAHOUT-185).
Does anyone have thoughts regarding this?

Drew
