pig-dev mailing list archives

From "Rohini Palaniswamy (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-4246) HBaseStorage should implement getShipFiles
Date Thu, 23 Oct 2014 22:44:34 GMT

     [ https://issues.apache.org/jira/browse/PIG-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-4246:
    Fix Version/s: 0.14.0
         Assignee: Rohini Palaniswamy
           Status: Patch Available  (was: Open)

> HBaseStorage should implement getShipFiles
> ------------------------------------------
>                 Key: PIG-4246
>                 URL: https://issues.apache.org/jira/browse/PIG-4246
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.14.0
>         Attachments: PIG-4246-1.patch
>
> HBaseStorage.initializeHBaseClassLoaderResources() uses TableMapReduceUtil APIs to add
> dependency jars. That sets the tmpjars setting, which makes JobClient ship the jars to HDFS
> and use that path in the distributed cache. This bypasses the optimizations in PIG-2672 and
> PIG-3861, which avoid shipping the jars to HDFS. Instead, HBaseStorage should implement the
> getShipFiles() API introduced in PIG-4141, so that PIG-2672 or PIG-3861 can avoid shipping
> the same jar multiple times to HDFS for a job.
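
For illustration only, here is a rough sketch of the idea behind getShipFiles(): instead of writing jars into tmpjars, the function reports the local paths of the jars it depends on and leaves shipping and caching to the framework. This is not the attached patch. It assumes the hook returns a list of local path strings, and ShipFileProvider, locate() and shipFilesFor() are hypothetical names that use only the JDK, so the sketch compiles without Pig or HBase on the classpath.

```java
import java.net.URL;
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names come from Pig or HBase.
class ShipFilesSketch {

    /** Hypothetical stand-in for the hook a load/store function would override. */
    interface ShipFileProvider {
        List<String> getShipFiles();
    }

    /** Returns the jar or class directory a class was loaded from, or null if unknown. */
    static String locate(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null) {
            return null; // e.g. JDK bootstrap classes have no code source
        }
        URL location = src.getLocation();
        // Note: getPath() may be URL-encoded (for example, spaces as %20).
        return location == null ? null : location.getPath();
    }

    /** Collects the distinct locations of the given classes, preserving order. */
    static List<String> shipFilesFor(Class<?>... classes) {
        Set<String> paths = new LinkedHashSet<>();
        for (Class<?> cls : classes) {
            String path = locate(cls);
            if (path != null) {
                paths.add(path); // the set drops repeats of the same jar
            }
        }
        return new ArrayList<>(paths);
    }

    public static void main(String[] args) {
        // A storage function would list classes from its dependency jars here;
        // this class itself is used only as a placeholder.
        ShipFileProvider provider = () -> shipFilesFor(ShipFilesSketch.class);
        for (String path : provider.getShipFiles()) {
            System.out.println(path);
        }
    }
}
```

In this sketch each dependency jar is reported at most once, no matter how many classes come from it. Avoiding repeated uploads to HDFS is then the job of the PIG-2672 and PIG-3861 optimizations, as the description above explains.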

This message was sent by Atlassian JIRA
