pig-dev mailing list archives

From Sergey Goder <sergeygo...@gmail.com>
Subject Re: Function To Compute Product of Values in Bag
Date Fri, 03 May 2013 20:36:08 GMT
Thanks for the tip about numerical accuracy issues and the elegant solution
exploiting log/exp. It is very much appreciated.

Sergey


On Fri, May 3, 2013 at 11:42 AM, Kai Londenberg <
kai.londenberg@googlemail.com> wrote:

> Hi,
>
> Just a hint: It's usually better to work with log probabilities and
> sum them than to work with raw probabilities and multiply them.
> Otherwise you can easily run into numerical accuracy issues, since
> the product of many small probabilities can underflow.
>
> i.e. exploit this fact:
>
> product(x1, ..., xn) = exp(sum(log(x1), ..., log(xn)))
>
> best,
>
> Kai Londenberg
>
> 2013/5/3 Sergey Goder <sergeygoder@gmail.com>:
> > I'm creating a multinomial naive Bayes classifier using Pig and need to
> > compute the product of probabilities. There are an arbitrary number of
> > values in the bag, so I would like to be able to use a function similar
> > to the builtin SUM to do this. I looked through the source code and
> > found that with some really simple changes to SUM.java I can create a
> > PROD.java function. I included it in my piggybank and have been using
> > it successfully.
> >
> > I was curious what the community thought about including this function
> > as a builtin function in a future release. Or would it make more sense
> > to keep this function as a UDF in piggybank?
> >
> > Thanks,
> > Sergey
>
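
For readers of the archive, the log/exp identity quoted above can be sketched in a few lines. This is only an illustration in Python, not the PROD.java from the thread and not Pig code. The function names (log_sum, product_via_logs, naive_product) and the sample values are made up for the example, and the sketch assumes non-negative inputs such as probabilities.

```python
import math


def log_sum(values):
    """Return sum(log(x)) over the values.

    Assumes non-negative inputs (e.g. probabilities). A zero anywhere
    makes the product zero, so the log-space result is -inf.
    """
    logs = []
    saw_zero = False
    for x in values:
        if x < 0:
            raise ValueError("log-space product needs non-negative inputs")
        if x == 0:
            saw_zero = True
        else:
            logs.append(math.log(x))
    if saw_zero:
        return float("-inf")
    # fsum reduces rounding error when adding many terms
    return math.fsum(logs)


def product_via_logs(values):
    """product(x1, ..., xn) = exp(sum(log(x1), ..., log(xn)))."""
    return math.exp(log_sum(values))


def naive_product(values):
    """Plain repeated multiplication, for comparison."""
    result = 1.0
    for x in values:
        result *= x
    return result


# Hypothetical sample data: a short bag and a long bag of small probabilities.
small = [0.5, 0.25, 0.125]
many = [0.1] * 400

small_product = product_via_logs(small)   # agrees with naive_product(small)
underflowed = naive_product(many)         # 1e-400 is not representable as a double
many_log_sum = log_sum(many)              # still a finite, comparable number
```

For a classifier it is usually enough to compare the log sums directly and skip the final exp, since exp of a very negative sum underflows just like the raw product does.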
