hadoop-pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-202) ComparatorFunc provided to ORDER clause is not always honoured
Date Fri, 25 Apr 2008 22:09:56 GMT

    [ https://issues.apache.org/jira/browse/PIG-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592518#action_12592518 ]

Alan Gates commented on PIG-202:
--------------------------------

A couple of comments/questions:

# I don't understand your changes to FindQuantiles.  If I read the code correctly, it's now
taking the first n values out of the bag, instead of sampling at regular intervals from the
whole bag.  This has the advantage of not needing to read the whole bag (though the code isn't
taking advantage of that), but it will give a much worse sample, at least if the bag is ordered.
Am I missing something?  Should we take the first n values and then break if the bag is
unordered, and take one value at every 1/n of the bag if the bag is ordered?
# Have you done any performance testing to get an idea of the speed up this gives?  Obviously
that will depend on the data set, but it would be interesting to see.
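
To make the concern in the first point concrete, here is a minimal sketch of the two sampling strategies. It is written in Python for brevity and is not the FindQuantiles code; the function names first_n and every_kth are hypothetical, and the bag is modeled as a plain sequence.

```python
def first_n(values, n):
    """Take the first n values and stop; the rest of the input is never read."""
    sample = []
    for v in values:
        if len(sample) >= n:
            break
        sample.append(v)
    return sample


def every_kth(values, n):
    """Take up to n values at regular intervals across the whole input."""
    values = list(values)  # assumption: the whole bag can be read
    if n <= 0 or not values:
        return []
    step = max(1, len(values) // n)
    return values[::step][:n]


ordered = list(range(100))    # stands in for an already-ordered bag
print(first_n(ordered, 4))    # [0, 1, 2, 3]
print(every_kth(ordered, 4))  # [0, 25, 50, 75]
```

On ordered input the first strategy only sees the low end of the key range, so any quantile boundaries derived from it would be skewed. The second covers the whole range but has to read the whole bag.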

FWIW, I haven't been ignoring your work on this.  It seemed you were making good progress
and getting feedback from Pi, so I hadn't jumped in yet.

> ComparatorFunc provided to ORDER clause is not always honoured
> --------------------------------------------------------------
>
>                 Key: PIG-202
>                 URL: https://issues.apache.org/jira/browse/PIG-202
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Mathieu Poumeyrol
>         Attachments: EvalSpec.patch, InstantiateFunc.patch, MapreducePlanCompiler.patch,
Sort.patch, Sort.v2.patch, TestOderBy.patch
>
>
> Specifying a comparator function is acknowledged neither by the local implementation, nor
by the quantile lookup job.
> Patch coming soon.


