crunch-dev mailing list archives

From "Chao Shi (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CRUNCH-352) Share library jars between MR stages
Date Sat, 22 Feb 2014 12:13:19 GMT
Chao Shi created CRUNCH-352:
-------------------------------

             Summary: Share library jars between MR stages
                 Key: CRUNCH-352
                 URL: https://issues.apache.org/jira/browse/CRUNCH-352
             Project: Crunch
          Issue Type: Improvement
            Reporter: Chao Shi


Currently, library jars are copied to the staging directory every time an MR job is submitted.
This is time-consuming when a pipeline consists of tens of stages. To make matters worse, the
job client may run on a network far from the cluster.

I found that Hive and Pig have, or will have, this optimization (HIVE-860 and PIG-2672). YARN
also has a similar plan (YARN-1492).

Although this is better done at the YARN/MR level, we can still do it as a client-side solution
to benefit users who cannot upgrade to the latest YARN or have to use legacy MRv1.
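
One way a client-side approach could work is sketched below, purely as an illustration and not as the design this issue settles on. Each jar is stored once in a shared cache directory keyed by its content hash, so later stages reuse the existing copy instead of uploading it again. The class SharedJarCache and its methods are hypothetical names, and a local directory stands in for the cluster file system to keep the example self-contained.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: a content-addressed jar cache shared by all stages.
// A local directory stands in for a shared location on the cluster.
public class SharedJarCache {
    private final Path cacheDir;
    private int copies = 0; // number of actual copies, for illustration only

    public SharedJarCache(Path cacheDir) {
        this.cacheDir = cacheDir;
    }

    // Hex-encoded SHA-256 of the file contents, used as the cache key.
    static String sha256Hex(Path file) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        try (InputStream in = new DigestInputStream(Files.newInputStream(file), digest)) {
            byte[] buf = new byte[8192];
            while (in.read(buf) != -1) {
                // Reading the stream feeds the digest; nothing else to do.
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Copies the jar into the cache only if no jar with identical content
    // is already there, and returns the cached path in either case.
    public Path ensureCached(Path localJar) throws IOException {
        Path target = cacheDir.resolve(sha256Hex(localJar)).resolve(localJar.getFileName());
        if (!Files.exists(target)) {
            Files.createDirectories(target.getParent());
            Files.copy(localJar, target);
            copies++;
        }
        return target;
    }

    public int copiesPerformed() {
        return copies;
    }
}
```

In this sketch, every stage would call ensureCached for each library jar and reference the returned path, so only the first stage pays the copy cost. Hashing still reads each jar locally, but that avoids the network transfer, which is the expensive part when the client is far from the cluster.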



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
