hadoop-pig-dev mailing list archives

From Benjamin Reed <br...@yahoo-inc.com>
Subject Re: Proposal for adding user defined ordering to pig
Date Mon, 05 Nov 2007 19:55:21 GMT
Using the compareTo method is just the default. We can define our own 
WritableComparator subclass to hook into the Hadoop sorting.
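
To illustrate the distinction, here is a minimal sketch in plain Java, not Pig or Hadoop code. The names Key, DescendingKeyComparator and sortedWith are hypothetical and exist only for this example. It shows a key type whose compareTo supplies the default order, and a separate comparator that overrides it without touching the key class. In Hadoop the analogous hook would, as far as I know, be a WritableComparator subclass registered on the job (for example via JobConf.setOutputKeyComparatorClass), but that wiring is not shown here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class UserDefinedOrderingSketch {

    // Hypothetical key type: compareTo defines the default (ascending) order.
    public static final class Key implements Comparable<Key> {
        final int value;

        public Key(int value) { this.value = value; }

        @Override
        public int compareTo(Key other) { return Integer.compare(value, other.value); }

        @Override
        public String toString() { return Integer.toString(value); }
    }

    // Hypothetical user-defined ordering (descending), kept outside the key class.
    public static final class DescendingKeyComparator implements Comparator<Key> {
        @Override
        public int compare(Key a, Key b) { return Integer.compare(b.value, a.value); }
    }

    // Sorts the values with the given comparator; null falls back to compareTo.
    public static String sortedWith(Comparator<Key> comparator, int... values) {
        List<Key> keys = new ArrayList<>();
        for (int v : values) {
            keys.add(new Key(v));
        }
        keys.sort(comparator); // List.sort(null) uses the natural ordering
        return keys.toString();
    }

    public static void main(String[] args) {
        System.out.println(sortedWith(null, 3, 1, 2));
        System.out.println(sortedWith(new DescendingKeyComparator(), 3, 1, 2));
    }
}
```

The point of the sketch is that no new key class has to be generated: the ordering lives in the comparator, and the sort only needs to be told which comparator to use.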

ben

On Monday 05 November 2007 11:27:12 Utkarsh Srivastava wrote:
> Hi Alan,
>
> Your language proposal sounds good.
>
> For the implementation proposal, it depends on which sorting we are
> talking about: the sorting of a whole table, or the sorting of the
> bags nested within foreach. They are implemented differently. The
> former uses Hadoop, while the latter is done entirely in our code.
>
> Your implementation proposal looks good for the latter (except, why
> would we create a new type of eval spec? We could change the
> sortdistinct spec to take an optional comparator argument.)
>
> For the sorting of the outer bag, we need to look for a way to pass
> the user-defined comparator to Hadoop. Can someone more familiar with
> Hadoop internals shed some light on this? Right now, it seems to me the
> only way would be to generate a class that has the user-defined
> comparator (because Hadoop uses the compareTo method of the keyClass).
>
> Utkarsh
>
> On Nov 2, 2007, at 4:05 PM, Alan Gates wrote:
> > All,
> >
> > I've posted a proposal at http://wiki.apache.org/pig/UserDefinedOrdering
> > for how to add user defined ordering to pig.
> > This is being urgently requested by some of our users.
> > Utkarsh, please review this and make sure I properly understood how
> > to hook things together in the logical and physical plans.  I'm not
> > 100% confident what I proposed will work in the current framework.
> >
> > Alan.


