hadoop-pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-20) Sorting using custom comparison functions
Date Tue, 27 Nov 2007 18:05:49 GMT

    [ https://issues.apache.org/jira/browse/PIG-20?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545942 ]

Alan Gates commented on PIG-20:
-------------------------------

Mostly looks good.  There are a few issues that need to be addressed:

1) Data in the provided unit tests is already sorted before it is fed to the sort tests.
It should be randomized to ensure we're really sorting the data and not merely passing it through.

2) In EvalSpec, the comparator can be instantiated either via a call to instantiateFunc or
via getComparator.  But getComparator has no way to instantiate the user-defined comparator.
It depends instead on instantiateFunc having been called; otherwise it returns the default
comparator.  This doesn't seem right: there is a possible code path where the user
provides a comparator but the default one still gets used.  If we cannot guarantee
that instantiateFunc has always been called before getComparator, then getComparator needs
to be changed to take a FunctionInstantiator so it can create the user-defined function if
necessary.  If we can guarantee that instantiateFunc _should_ always be called before getComparator,
then getComparator should throw a RuntimeException if the member variable comparator is null.

3) Nitpicky code formatting issues:
    a) The Pig coding standard is now to use spaces instead of tabs.  In the existing files
you were correct to stay with the standard of the surrounding code, but in the newly
created files spaces should be used.
    b) Lines shouldn't exceed 80 characters in length when possible.
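
To make the two alternatives in point 2 concrete, here is a minimal sketch.
It is not the code in the patch: SortSpec and Instantiator are hypothetical
stand-ins for EvalSpec and FunctionInstantiator, and the assumption that a
null comparator name means "no user comparator was given" is mine.

```java
import java.util.Comparator;

public class ComparatorLookupSketch {
    /** Hypothetical stand-in for FunctionInstantiator. */
    public interface Instantiator {
        Comparator<String> instantiate(String funcName);
    }

    /** Hypothetical stand-in for the sort-related part of EvalSpec. */
    public static class SortSpec {
        // Assumption: null means the user did not supply a comparator.
        private final String comparatorName;
        private Comparator<String> comparator;

        public SortSpec(String comparatorName) {
            this.comparatorName = comparatorName;
        }

        public void instantiateFunc(Instantiator inst) {
            if (comparatorName != null) {
                comparator = inst.instantiate(comparatorName);
            }
        }

        // Option A: getComparator can create the user comparator itself,
        // so the call order no longer matters.
        public Comparator<String> getComparator(Instantiator inst) {
            if (comparatorName == null) {
                return Comparator.naturalOrder();
            }
            if (comparator == null) {
                comparator = inst.instantiate(comparatorName);
            }
            return comparator;
        }

        // Option B: instantiateFunc must be called first; fail loudly
        // instead of silently falling back to the default comparator.
        public Comparator<String> getComparator() {
            if (comparatorName == null) {
                return Comparator.naturalOrder();
            }
            if (comparator == null) {
                throw new IllegalStateException(
                    "instantiateFunc was not called for " + comparatorName);
            }
            return comparator;
        }
    }

    public static void main(String[] args) {
        // Every name resolves to reverse order in this toy instantiator.
        Instantiator inst = name -> Comparator.<String>reverseOrder();

        SortSpec a = new SortSpec("ReverseOrder");
        System.out.println(a.getComparator(inst).compare("a", "b") > 0);

        SortSpec b = new SortSpec("ReverseOrder");
        try {
            b.getComparator();
        } catch (IllegalStateException e) {
            System.out.println("Option B refused: " + e.getMessage());
        }
    }
}
```

Either option closes the path where the user's comparator is silently
replaced by the default one; which one fits depends on whether the call
order can actually be guaranteed.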

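For point 1, one way to randomize the test input while keeping the tests
repeatable is to shuffle with a fixed seed.  The sketch below is only an
illustration; the class and method names are hypothetical and nothing here
comes from the existing test code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;

public class ShuffledSortInput {
    /**
     * Returns a shuffled copy of sortedRows.  The fixed seed keeps test
     * runs repeatable.  If the rows contain at least two distinct values,
     * the result is guaranteed to differ from the sorted order.
     */
    public static List<String> shuffledCopy(List<String> sortedRows,
                                            long seed) {
        List<String> copy = new ArrayList<>(sortedRows);
        boolean canDiffer = new HashSet<>(sortedRows).size() > 1;
        Random rnd = new Random(seed);
        do {
            Collections.shuffle(copy, rnd);
        } while (canDiffer && copy.equals(sortedRows));
        return copy;
    }

    public static void main(String[] args) {
        List<String> expected =
            Arrays.asList("apple", "banana", "cherry", "date");
        List<String> input = shuffledCopy(expected, 42L);

        // Stand-in for the sort under test.
        List<String> output = new ArrayList<>(input);
        Collections.sort(output);

        System.out.println("input was unsorted: " + !input.equals(expected));
        System.out.println("output is sorted:   " + output.equals(expected));
    }
}
```

Checking that the input really is unsorted before the sort runs guards
against a test that passes only because the data was passed through.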


> Sorting  using custom comparison functions
> ------------------------------------------
>
>                 Key: PIG-20
>                 URL: https://issues.apache.org/jira/browse/PIG-20
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Olga Natkovich
>         Attachments: usercompare.patch
>
>
> Currently, only string-based sorting is supported. Once we have types, numeric sort will
> be supported as well. However, some users have expressed a need for custom comparison
> functions for sorting.
> Alan put together a design document for this:
> http://wiki.apache.org/pig/UserDefinedOrdering

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

