pig-dev mailing list archives

From "Cheolsoo Park (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-4074) mapreduce.client.submit.file.replication is not honored in cached files
Date Sat, 26 Jul 2014 01:08:38 GMT

     [ https://issues.apache.org/jira/browse/PIG-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park updated PIG-4074:

    Attachment: PIG-4074-2.patch

[~rohini], I created an MRConfiguration class and refactored all the MR properties into constants.
My patch got big, so I uploaded it to RB too-


> mapreduce.client.submit.file.replication is not honored in cached files
> -----------------------------------------------------------------------
>                 Key: PIG-4074
>                 URL: https://issues.apache.org/jira/browse/PIG-4074
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>             Fix For: 0.14.0
>         Attachments: PIG-4074-1.patch, PIG-4074-2.patch
> Pig ships files to HDFS in several cases (e.g. replicated join, streaming cached files,
etc.). But {{mapreduce.client.submit.file.replication}} (or {{mapred.submit.replication}} for
Hadoop 1.x) is not honored, and this has a performance impact, since many tasks read the same
HDFS blocks in a large cluster.
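
For illustration only, here is a minimal sketch of how the replication factor for shipped files could be resolved from the two properties named above. It is not the code in the attached patches: the class name SubmitReplicationSketch, the method resolveSubmitReplication, and the use of java.util.Properties as a stand-in for a Hadoop Configuration are all assumptions made to keep the example self-contained. The fallback of 10 is assumed to match the value Hadoop ships in mapred-default.xml.

```java
import java.util.Properties;

// Hypothetical helper; names are illustrative and not taken from the patch.
public class SubmitReplicationSketch {
    // Property names as given in the issue description.
    static final String MR2_SUBMIT_REPLICATION = "mapreduce.client.submit.file.replication";
    static final String MR1_SUBMIT_REPLICATION = "mapred.submit.replication";

    // Assumed fallback when neither property is set.
    static final short DEFAULT_SUBMIT_REPLICATION = 10;

    // Prefer the Hadoop 2.x name, fall back to the Hadoop 1.x name, then to the default.
    static short resolveSubmitReplication(Properties conf) {
        String value = conf.getProperty(MR2_SUBMIT_REPLICATION);
        if (value == null) {
            value = conf.getProperty(MR1_SUBMIT_REPLICATION);
        }
        if (value == null) {
            return DEFAULT_SUBMIT_REPLICATION;
        }
        return Short.parseShort(value.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(MR1_SUBMIT_REPLICATION, "20");
        // Only the Hadoop 1.x name is set here, so that value is used.
        System.out.println(resolveSubmitReplication(conf));
    }
}
```

In a real job, the resolved value would then be applied to each shipped file, for example through Hadoop's FileSystem.setReplication(Path, short), so that the cached files get the same replication as other job submission files.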

This message was sent by Atlassian JIRA
