hadoop-pig-dev mailing list archives

From "Ashutosh Chauhan (JIRA)" <j...@apache.org>
Subject [jira] Created: (PIG-865) Performance: Unnecessary computation in FRJoin
Date Fri, 26 Jun 2009 20:49:47 GMT
Performance: Unnecessary computation in FRJoin

                 Key: PIG-865
                 URL: https://issues.apache.org/jira/browse/PIG-865
             Project: Pig
          Issue Type: Improvement
          Components: impl
    Affects Versions: 0.3.0
            Reporter: Ashutosh Chauhan
            Priority: Minor
             Fix For: 0.4.0

The POFRJoin implementation uses POLocalRearrange to extract the join keys from each input tuple.
When the keys match, the input tuples are fed to a Foreach, which performs the actual join by
crossing its inputs. After the keys are extracted from the POLocalRearrange output, however, the
function getValueTuple(POLocalRearrange lr, Tuple tuple) is called to reconstruct the input tuple.
This call appears unnecessary, since the input tuple is already available at that point.

This is not a bug, but because the function is called for every tuple, eliminating it
should help performance.
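
The redundancy described above can be illustrated with a small, self-contained sketch. It does
not use Pig's classes: tuples are modeled as plain Java lists, and the names FrJoinSketch,
rearrange, rebuild, and join are hypothetical stand-ins (rearrange loosely plays the role of
POLocalRearrange, rebuild the role of getValueTuple). It assumes a single join key column at
the same index in both inputs. The rebuildInput flag switches between reconstructing the input
tuple from the rearranged form and simply reusing the tuple that is already in hand.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model of a fragment-replicate (hash) join. Not Pig's implementation.
class FrJoinSketch {

    // Hypothetical stand-in for the local rearrange step:
    // turns a tuple into [key, remainingValues].
    static List<Object> rearrange(List<Object> tuple, int keyIdx) {
        List<Object> rest = new ArrayList<>(tuple);
        rest.remove(keyIdx); // remove(int index): drops the key column
        List<Object> out = new ArrayList<>();
        out.add(tuple.get(keyIdx));
        out.add(rest);
        return out;
    }

    // Hypothetical stand-in for the reconstruction step:
    // rebuilds the original tuple from [key, remainingValues].
    @SuppressWarnings("unchecked")
    static List<Object> rebuild(List<Object> rearranged, int keyIdx) {
        List<Object> tuple = new ArrayList<>((List<Object>) rearranged.get(1));
        tuple.add(keyIdx, rearranged.get(0));
        return tuple;
    }

    static List<List<Object>> join(List<List<Object>> replicated,
                                   List<List<Object>> streamed,
                                   int keyIdx,
                                   boolean rebuildInput) {
        // Build an in-memory hash table over the small (replicated) input.
        Map<Object, List<List<Object>>> table = new HashMap<>();
        for (List<Object> t : replicated) {
            table.computeIfAbsent(t.get(keyIdx), k -> new ArrayList<>()).add(t);
        }

        List<List<Object>> result = new ArrayList<>();
        for (List<Object> input : streamed) {
            // The rearranged form is needed only to obtain the key.
            List<Object> lr = rearrange(input, keyIdx);
            List<List<Object>> matches = table.get(lr.get(0));
            if (matches == null) {
                continue;
            }
            // Rebuilding yields a tuple equal to 'input', so the extra
            // per-tuple work can be skipped by reusing 'input' directly.
            List<Object> left = rebuildInput ? rebuild(lr, keyIdx) : input;
            for (List<Object> right : matches) {
                List<Object> joined = new ArrayList<>(left);
                joined.addAll(right);
                result.add(joined);
            }
        }
        return result;
    }
}
```

Both settings of rebuildInput produce the same joined tuples in this model; the only difference
is the extra list allocation and copy per matching input tuple when rebuilding is enabled.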

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
