pig-dev mailing list archives

From Utkarsh Srivastava <utka...@yahoo-inc.com>
Subject Re: Proposal for adding user defined ordering to pig
Date Mon, 05 Nov 2007 19:27:12 GMT
Hi Alan,

Your language proposal sounds good.

For the implementation proposal, it depends on which sorting we are
talking about: the sorting of a whole table, or the sorting of the
bags nested within foreach. They are implemented differently. The
former uses Hadoop, while the latter is done entirely in our code.

Your implementation proposal looks good for the latter (except: why
would we create a new type of eval spec? We could change the
sortdistinct spec to take an optional comparator argument.)
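
A minimal sketch of the optional-comparator idea, in plain Java. The
class name SortSpecSketch is hypothetical and is not Pig's actual
sortdistinct spec; it only shows how one spec could cover both the
default ordering and a user-defined one.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a sort spec; NOT Pig's real class.
class SortSpecSketch<T extends Comparable<T>> {
    // null means "no user comparator given, use natural ordering"
    private final Comparator<T> comparator;

    SortSpecSketch() {
        this(null);
    }

    SortSpecSketch(Comparator<T> userComparator) {
        this.comparator = userComparator;
    }

    // Returns a sorted copy of the nested bag, leaving the input untouched.
    List<T> sort(List<T> bag) {
        List<T> copy = new ArrayList<>(bag);
        // List.sort(null) falls back to the elements' natural ordering.
        copy.sort(comparator);
        return copy;
    }
}
```

With this shape, existing callers keep using the no-argument form, and
only user-defined ordering passes a comparator.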

For the sorting of the outer bag, we need to find a way to pass
the user-defined comparator to Hadoop. Can someone more familiar with
Hadoop internals shed some light on this? Right now, it seems to me the
only way would be to generate a class that contains the user-defined
comparator (because Hadoop uses the compareTo method of the keyClass).
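
To make the generated-class idea concrete, here is a rough sketch of
what such a key class might look like. All names (ByLengthComparator,
GeneratedSortKey) are hypothetical, and the Hadoop serialization
methods a real key class would need are left out so the example stays
self-contained. It assumes the framework creates key objects itself
through a no-argument constructor, which is why the comparator is
fixed inside the class rather than passed in.

```java
import java.util.Comparator;

// Hypothetical user-supplied comparator: orders strings by length.
class ByLengthComparator implements Comparator<String> {
    @Override
    public int compare(String a, String b) {
        return Integer.compare(a.length(), b.length());
    }
}

// Sketch of a generated key class. A real one would also have to
// implement Hadoop's serialization interface; omitted here.
class GeneratedSortKey implements Comparable<GeneratedSortKey> {
    // The user comparator is baked into the class, since (by assumption)
    // the framework instantiates keys without constructor arguments.
    private static final Comparator<String> USER_COMPARATOR = new ByLengthComparator();

    String value;

    GeneratedSortKey() {
    }

    GeneratedSortKey(String value) {
        this.value = value;
    }

    // compareTo simply delegates to the user-defined comparator.
    @Override
    public int compareTo(GeneratedSortKey other) {
        return USER_COMPARATOR.compare(this.value, other.value);
    }
}
```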


On Nov 2, 2007, at 4:05 PM, Alan Gates wrote:

> All,
> I've posted a proposal at
> http://wiki.apache.org/pig/UserDefinedOrdering
> for how to add user defined ordering to pig.
> This is being urgently requested by some of our users.
> Utkarsh, please review this and make sure I properly understood how  
> to hook things together in the logical and physical plans.  I'm not  
> 100% confident what I proposed will work in the current framework.
> Alan.
