hadoop-pig-dev mailing list archives

From "Olga Natkovich (JIRA)" <j...@apache.org>
Subject [jira] Created: (PIG-1218) Use distributed cache to store samples
Date Wed, 03 Feb 2010 19:23:28 GMT
Use distributed cache to store samples

                 Key: PIG-1218
                 URL: https://issues.apache.org/jira/browse/PIG-1218
             Project: Pig
          Issue Type: Improvement
            Reporter: Olga Natkovich
            Assignee: Richard Ding
             Fix For: 0.7.0

Currently, for skew join and order by, the sample is simply written to the DFS rather than
placed in the distributed cache. As a result, it gets opened and copied around more often than
necessary. This hurts query performance and also places unnecessary load on the name node.
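
To make the name node concern concrete, here is a rough back-of-the-envelope model, not Pig code.
It assumes that every task reading the sample directly from the DFS opens the file once, and that
the distributed cache localizes a file once per node per job, so that tasks read a local copy.
The class name, method names, and the task and node counts are all hypothetical.

```java
// Illustrative model only: estimates how many times the sample file is opened
// on the DFS under two strategies. Not taken from the Pig code base.
public class SampleOpenModel {

    // Direct reads: every task opens the sample file on the DFS itself.
    static long opensWithDirectDfsReads(long tasks) {
        return tasks;
    }

    // Distributed cache (assumed behavior): the file is fetched once per node
    // that runs at least one task; tasks then read the node-local copy.
    static long opensWithDistributedCache(long nodes, long tasks) {
        return Math.min(nodes, tasks);
    }

    public static void main(String[] args) {
        long tasks = 1000; // hypothetical number of tasks that need the sample
        long nodes = 50;   // hypothetical number of nodes running those tasks
        System.out.println("direct DFS reads:  " + opensWithDirectDfsReads(tasks));
        System.out.println("distributed cache: " + opensWithDistributedCache(nodes, tasks));
    }
}
```

Under these assumptions the number of opens grows with the number of nodes rather than the
number of tasks, which is the saving this issue is after.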

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.