On Wed, May 2, 2012 at 9:34 PM, Ted Dunning wrote:

> On Wed, May 2, 2012 at 9:05 PM, Jake Mannix wrote:
>
> > On Wed, May 2, 2012 at 8:07 PM, Ted Dunning wrote:
> >
> > > Making a pig module for mahout is a fine idea. The twitter guys may
> > > have something better, though, so we should explore that as well.
> > > Andy's comments make that possibility very interesting.
> >
> > What I'd want to suggest is that anyone who wants to move rapidly on
> > pig/mahout integration should start a github repo which doesn't
> > directly inject itself into mahout, but stands separately for now,
> > but then the maven dependency DAG rears its ugly head:
> >
> >   pig-vector depends on mahout-core
> >
> > so if we *do* want to start writing cool stuff *in mahout* which
> > depends on it,
>
> I think that we are fine if we just create a pig module in mahout. It can
> depend on the external stuff and mahout-core. That would be the natural
> time and place to put the fancy pig-vector-ish stuff anyway.
>
> So I am not worried about this. We would have separation of mahout-pig
> stuff from mahout-core-ish stuff and all should be fine.

Yeah, most likely the idea would be that mahout-pig would depend on more
than just writables, in the long run: UDF wrappers for everything we stuff
into one (a la Jimmy Lin et al's "Training a smarter pig" talk at Hadoop
World).

> > we're circularly dependently self-destruct. Now, if we had a proper
> > mahout-writables maven module (*ahem*!), which had all the stuff
> > pig-vector needed, and mahout-core depended on this, then mahout-core
> > (or mahout-examples) could still depend on pig-vector (or something
> > like it, like the elephant-bird-loaders slim dep) at some point.
>
> I would rather not have Mahout depend on unreleased github stuff. If it is
> good enough to depend on, it is good enough to suck into the main
> deliverable.

Oh I wasn't meaning core should depend on unreleased stuff, more like the
elephant-bird slimmed down module, once released.

--
  -jake
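
For anyone following the dependency argument, here is a minimal sketch of the three module layouts discussed in this thread, written as plain Python dictionaries with a small cycle check. Nothing here is taken from a real pom.xml: the module names come from the thread, the edges are one reading of what was proposed, and `find_cycle` is a hypothetical helper, not part of Maven or Mahout.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of module names, or None.

    `deps` maps a module name to the list of modules it depends on.
    Uses a depth-first search and tracks the modules on the current path.
    """
    IN_PROGRESS, DONE = 1, 2
    state = {}   # modules not present here have not been visited yet
    path = []    # modules on the current DFS path, in visit order

    def visit(node):
        state[node] = IN_PROGRESS
        path.append(node)
        for dep in deps.get(node, ()):
            if state.get(dep) == IN_PROGRESS:
                # dep is already on the current path: that is a cycle
                return path[path.index(dep):] + [dep]
            if dep not in state:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        state[node] = DONE
        return None

    for node in list(deps):
        if node not in state:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Assumed layout 1: pig-vector depends on mahout-core, and mahout-core
# starts depending on pig-vector. This is the circular case.
circular = {
    "mahout-core": ["pig-vector"],
    "pig-vector": ["mahout-core"],
}

# Assumed layout 2: a separate mahout-writables module that both
# pig-vector and mahout-core depend on.
writables_split = {
    "mahout-writables": [],
    "pig-vector": ["mahout-writables"],
    "mahout-core": ["mahout-writables", "pig-vector"],
}

# Assumed layout 3: a mahout-pig module that depends on mahout-core and
# on the external pig-vector code.
pig_module = {
    "mahout-core": [],
    "pig-vector": ["mahout-core"],
    "mahout-pig": ["mahout-core", "pig-vector"],
}

for name, layout in [("circular", circular),
                     ("writables_split", writables_split),
                     ("pig_module", pig_module)]:
    print(name, "->", find_cycle(layout))
```

Under these assumptions only the first layout has a cycle; both the mahout-writables split and a mahout-pig module keep the graph acyclic. They differ in whether mahout-core itself ever gets to use the pig code.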