pig-dev mailing list archives

From "Rohini Palaniswamy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-4958) Tez autoparallelism estimation for order by is higher than mapreduce
Date Mon, 25 Jul 2016 02:00:26 GMT

    [ https://issues.apache.org/jira/browse/PIG-4958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15391252#comment-15391252 ]

Rohini Palaniswamy commented on PIG-4958:
-----------------------------------------

When launched from bin/pig, this does not work because the tasks do not have the RM token.
The patch works as is with the Oozie Pig action, since Oozie always adds the RM token. Even
when running bin/pig, the RM token should already be fetched, because the AM needs it. I still
have to figure out how to get it passed to the tasks.

> Tez autoparallelism estimation for order by is higher than mapreduce
> --------------------------------------------------------------------
>
>                 Key: PIG-4958
>                 URL: https://issues.apache.org/jira/browse/PIG-4958
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.17.0
>
>         Attachments: PIG-4958-withoutsecurity.patch
>
>
>   The input size is calculated from the size of the samples in memory. The size in memory
> is usually 4x the serialized size or more. MapReduce estimates the number of reducers
> based on the serialized size.
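
To illustrate the arithmetic behind the description above, here is a minimal sketch, not Pig's actual estimator code. It assumes a simple ceiling-division estimate of input bytes over bytes per reducer, capped at a maximum. The constants mirror what I understand to be the defaults of pig.exec.reducers.bytes.per.reducer and pig.exec.reducers.max; the function name, the 10 GB input, and the exact 4x factor are hypothetical values chosen for the example.

```python
# Assumed defaults, mirroring pig.exec.reducers.bytes.per.reducer
# and pig.exec.reducers.max; adjust to match your configuration.
BYTES_PER_REDUCER = 1_000_000_000
MAX_REDUCERS = 999


def estimate_reducers(input_bytes,
                      bytes_per_reducer=BYTES_PER_REDUCER,
                      max_reducers=MAX_REDUCERS):
    """Hypothetical helper: ceil(input / bytes_per_reducer), clamped to [1, max]."""
    needed = -(-input_bytes // bytes_per_reducer)  # integer ceiling division
    return max(1, min(max_reducers, needed))


serialized_bytes = 10 * 10**9          # hypothetical 10 GB of serialized input
in_memory_bytes = 4 * serialized_bytes  # assumed 4x in-memory inflation

print(estimate_reducers(serialized_bytes))  # estimate from serialized size
print(estimate_reducers(in_memory_bytes))   # estimate from in-memory size
```

Under these assumptions the estimate driven by in-memory size is four times the one driven by serialized size, which matches the discrepancy the issue describes.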



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
