mahout-dev mailing list archives

From Jeff Zhang <>
Subject Re: Algorithm implementations in Pig
Date Mon, 22 Feb 2010 08:16:43 GMT

Glad to hear that Mahout devs are interested in Pig. I believe Pig is very
helpful when you want to quickly prototype a machine learning algorithm.
Pig also has a Java API, so it is easy to integrate Pig scripts with Java.
Maybe we can start by implementing Naive Bayes (NB) in Pig.

On Mon, Feb 22, 2010 at 3:56 PM, Ted Dunning <> wrote:

> I have had both positive and negative results with PIG.
> The positive results were that I was able to express large recommendation
> computations in a very concise way.  That was really helpful.
> My negative results have had to do with the brittle nature of PIG vis-à-vis
> the version of the underlying Hadoop system.  That problem may have abated
> somewhat as everybody in the world except me and Amazon's EMR has pretty
> much piled up on version 20.  I also know little about how well Pig would
> interface with other components.  I know that I have had difficulty in
> the past injecting outside information into Pig, but that has been
> improved.  I also know that "Pigs eat anything", but have no clear idea how
> well this would play out with, say, our vector formats and vectorizers.
> Ankur, what recent experience do you have?  How well do PIG scripts play
> with other programs these days?
> On Sun, Feb 21, 2010 at 11:41 PM, Ankur C. Goel <> wrote:
> > I had Sean's opinion on this and he was not too comfortable with the idea
> > of having things in different languages in Mahout. However, given the
> > benefits of PIG, I feel otherwise. I may be biased here due to my own
> > experience of being able to do more in less time in Pig than in M/R, so I
> > thought I would ask how folks feel.
> >
> > Ted, I believe you have some PIG experience yourself, so any thoughts on
> > this?
> >
> --
> Ted Dunning, CTO
> DeepDyve

Best Regards

Jeff Zhang
