mahout-dev mailing list archives

From Ted Dunning <>
Subject Re: core? util?
Date Tue, 09 Feb 2010 20:20:37 GMT
We (I) have had some problems with dependencies in the past.  Some code
seemed very util, but some other things that seemed pretty core depended on

I think that the real issue for me is that we have two meanings of utils.
One is "generally useful stuff in core" and the other is "things that use
mahout to do cool things".

On Mon, Feb 8, 2010 at 3:08 PM, Sean <> wrote:

> I view util as a place for relatively stand-alone tools and classes
> which perhaps bridge Mahout and other packages. For example this is
> where utilities for converting between Lucene stuff and Mahout live.
> It doesn't quite belong in core since Lucene integration isn't a core
> attribute of Mahout and it doesn't make sense to make Mahout users
> necessarily depend on Lucene.
>
> utils is somewhere between core and examples. examples is like the
> code that the users will write. core is the code they use. util
> supports using core, for particular use cases.
>
> I'm sensing part of it should be in core and part in utils. Given that
> guideline above (assuming it's right), could you picture part of it
> being plainly reusable and generic and part being ancillary, support
> code?
> On Mon, Feb 8, 2010 at 7:23 PM, Drew Farris <> wrote:
> > What's the general consensus (if such exists) about what goes in core
> > vs. util?
> >
> > Over on MAHOUT-242 there is some discussion about where to put the
> > n-gram / LLR collocation utilities, and since I'm relatively new here
> > I don't feel like I can make a point about it going one place or
> > another without an understanding of the purpose of the different
> > modules.
> >
> > In some ways I can see 242 being a utility -- used for the preparation
> > of language models or something, upon which core algorithms depend. On
> > the other hand I could see mahout including a suite of nlp algorithms
> > in core where 242 is simply a starting point.
> >
> > Drew
> >

Ted Dunning, CTO
