pig-dev mailing list archives

From "liyunzhang_intel (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-4766) Ensure GroupBy is optimized for all algebraic Operations
Date Thu, 04 Feb 2016 08:55:39 GMT

    [ https://issues.apache.org/jira/browse/PIG-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131978#comment-15131978
] 

liyunzhang_intel commented on PIG-4766:
---------------------------------------

[~pallavi.rao]: PIG-4766-1.patch looks good except for the following problem in
org.apache.pig.backend.hadoop.executionengine.spark.converter.ReduceByConverter.MergeValuesFunction:
{code}
        public Tuple apply(Tuple v1, Tuple v2) {
            LOG.debug("MergeValuesFunction in : " + v1 + " , " + v2);
            Tuple result = tf.newTuple(2);
            DataBag bag = DefaultBagFactory.getInstance().newDefaultBag();
            Tuple t = new DefaultTuple();
            try {
                // Package the input tuples so they can be processed by Algebraic functions.
                Object key = v1.get(0);
                if (key == null) {
                    key = "";
                } else {
                    result.set(0, key);
                }
   ....
{code}
Is it OK that tuples with a null key are treated as having the same key? For example, the two
tuples (,20) and (,20) will be considered to have the same key, and poReduce.getNext() will be
executed on them.
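
To make the question concrete, here is a minimal plain-Java sketch of the effect. It does not use the Pig or Spark APIs: the `NullKeyMerge` class, the `Pair` type and the `sumByKey` helper are hypothetical names made up for illustration, and the sketch assumes that a null key is replaced by an empty string before records are merged by key, as the `key = ""` assignment above suggests.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class NullKeyMerge {
    /** Hypothetical stand-in for a (key, value) tuple; not a Pig class. */
    static final class Pair {
        final String key;   // may be null
        final int value;

        Pair(String key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Sums values per key. A null key is replaced by "" first,
     * which is the assumption this sketch is meant to illustrate.
     */
    static Map<String, Integer> sumByKey(List<Pair> pairs) {
        Map<String, Integer> sums = new HashMap<>();
        for (Pair p : pairs) {
            String key = (p.key == null) ? "" : p.key;
            sums.merge(key, p.value, Integer::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<Pair> input = new ArrayList<>();
        input.add(new Pair(null, 20)); // (,20)
        input.add(new Pair(null, 20)); // (,20)
        input.add(new Pair("", 5));    // a genuine empty-string key
        input.add(new Pair("a", 1));
        System.out.println(sumByKey(input));
    }
}
```

Under that assumption, both null-key records land in the same entry, and so does the record whose key really is the empty string (the three values sum to 45). Whether that merging is the intended grouping behaviour is exactly what the question above asks.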
 


> Ensure GroupBy is optimized for all algebraic Operations
> --------------------------------------------------------
>
>                 Key: PIG-4766
>                 URL: https://issues.apache.org/jira/browse/PIG-4766
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: Pallavi Rao
>            Assignee: Pallavi Rao
>              Labels: spork
>             Fix For: spark-branch
>
>         Attachments: PIG-4766-v1.patch, PIG-4766.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
