hadoop-pig-dev mailing list archives

From "Olga Natkovich (JIRA)" <j...@apache.org>
Subject [jira] Created: (PIG-465) PERFORMANCE: removing keys from the value
Date Tue, 30 Sep 2008 00:00:44 GMT
PERFORMANCE: removing keys from the value

                 Key: PIG-465
                 URL: https://issues.apache.org/jira/browse/PIG-465
             Project: Pig
          Issue Type: Improvement
    Affects Versions: types_branch
            Reporter: Olga Natkovich
             Fix For: types_branch

Currently, reducers receive the key data twice: once in the key and once in the value. If the
grouping key makes up a large part of the value, this duplicates a lot of data and hurts performance.

The key should not be sent as part of the value. Instead, metadata should be used to help
reconstruct the row from the key and the remaining data.
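
To make the proposal concrete, here is a minimal sketch in Python of the general idea, not Pig's actual implementation. The function names (strip_key, rebuild_row) and the choice of metadata (the key column positions plus the row width) are assumptions made for illustration only. The map side would drop the key fields from the value, and the reduce side would use the metadata to put them back.

```python
from typing import Sequence, Tuple


def strip_key(row: Sequence, key_positions: Sequence[int]) -> Tuple[tuple, tuple]:
    """Split a row into (key, rest) so the key fields are shipped only once.

    key_positions is hypothetical metadata: the column indexes that form
    the grouping key, in key order.
    """
    key_set = set(key_positions)
    key = tuple(row[i] for i in key_positions)
    rest = tuple(v for i, v in enumerate(row) if i not in key_set)
    return key, rest


def rebuild_row(key: Sequence, rest: Sequence,
                key_positions: Sequence[int], row_width: int) -> tuple:
    """Reassemble the original row from the key, the remaining fields,
    and the same metadata used by strip_key."""
    key_at = dict(zip(key_positions, key))
    rest_iter = iter(rest)
    # Key columns come from the key; every other column is taken from
    # the remaining fields in their original order.
    return tuple(key_at[i] if i in key_at else next(rest_iter)
                 for i in range(row_width))


row = ("us", "2008-09-30", 42, "click")
key_positions = [1, 0]  # group by (date, country)

key, rest = strip_key(row, key_positions)
restored = rebuild_row(key, rest, key_positions, len(row))
```

In this sketch only rest travels as the value, so a row whose grouping key covers most of its columns shrinks accordingly, while the metadata is fixed per job rather than sent per row.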
