hadoop-pig-dev mailing list archives

From "Daniel Dai (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-890) Create a sampler interface and improve the skewed join sampler
Date Fri, 21 Aug 2009 07:52:14 GMT

    [ https://issues.apache.org/jira/browse/PIG-890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745843#action_12745843 ]

Daniel Dai commented on PIG-890:
--------------------------------

Your wiki states: "For an 1TB file running on nodes which have 512 MB of memory, assuming a conversion
factor of 2, the number of base samples turn out to be 4000". Can you explain how that
number is derived?

> Create a sampler interface and improve the skewed join sampler
> --------------------------------------------------------------
>
>                 Key: PIG-890
>                 URL: https://issues.apache.org/jira/browse/PIG-890
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Sriranjan Manjunath
>         Attachments: sampler.patch
>
>
> We need a different sampler for order by and skewed join. We thus need a better sampling
> interface. The design of the same is described here: http://wiki.apache.org/pig/PigSampler

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

