tez-commits mailing list archives

From jl...@apache.org
Subject tez git commit: TEZ-3240. Improvements to tez.lib.uris to allow for multiple tarballs and mixing tarballs and jars. (Eric Badger via hitesh) (cherry picked from commit b3712f863c630cea263a499183f57f9564be6a0f)
Date Fri, 20 May 2016 20:40:04 GMT
Repository: tez
Updated Branches:
  refs/heads/branch-0.8 fa65f3599 -> 6d8eb3a59


TEZ-3240. Improvements to tez.lib.uris to allow for multiple tarballs and mixing tarballs
and jars. (Eric Badger via hitesh)
(cherry picked from commit b3712f863c630cea263a499183f57f9564be6a0f)

Conflicts:

	CHANGES.txt


Project: http://git-wip-us.apache.org/repos/asf/tez/repo
Commit: http://git-wip-us.apache.org/repos/asf/tez/commit/6d8eb3a5
Tree: http://git-wip-us.apache.org/repos/asf/tez/tree/6d8eb3a5
Diff: http://git-wip-us.apache.org/repos/asf/tez/diff/6d8eb3a5

Branch: refs/heads/branch-0.8
Commit: 6d8eb3a59b93ad47d2d4a6e3a193635168d43759
Parents: fa65f35
Author: Jason Lowe <jlowe@apache.org>
Authored: Fri May 20 20:39:40 2016 +0000
Committer: Jason Lowe <jlowe@apache.org>
Committed: Fri May 20 20:39:40 2016 +0000

----------------------------------------------------------------------
 CHANGES.txt                                     |   1 +
 docs/src/site/markdown/install.md               | 166 +++++++++++++++----
 .../org/apache/tez/client/TezClientUtils.java   | 124 +++++++++-----
 .../org/apache/tez/common/TezYARNUtils.java     |  55 +++---
 .../apache/tez/dag/api/TezConfiguration.java    |  21 ++-
 .../apache/tez/client/TestTezClientUtils.java   |  85 ++++++++++
 .../org/apache/tez/common/TestTezYARNUtils.java |  11 ++
 7 files changed, 369 insertions(+), 94 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/tez/blob/6d8eb3a5/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 794d06e..5b952c2 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -7,6 +7,7 @@ INCOMPATIBLE CHANGES
 
 ALL CHANGES:
 
+  TEZ-3240. Improvements to tez.lib.uris to allow for multiple tarballs and mixing tarballs and jars.
   TEZ-3237. Corrupted shuffle transfers to disk are not detected during transfer
   TEZ-3246. Improve diagnostics when DAG killed by user
   TEZ-3258. Jvm Checker does not ignore DisableExplicitGC when checking JVM GC options.

http://git-wip-us.apache.org/repos/asf/tez/blob/6d8eb3a5/docs/src/site/markdown/install.md
----------------------------------------------------------------------
diff --git a/docs/src/site/markdown/install.md b/docs/src/site/markdown/install.md
index a09f4bd..2bc0cfb 100644
--- a/docs/src/site/markdown/install.md
+++ b/docs/src/site/markdown/install.md
@@ -20,14 +20,16 @@
 Install/Deploy Instructions for Tez
 ---------------------------------------------------------------------------
 Replace x.y.z with the tez release number that you are using. E.g. 0.5.0. For Tez 
-versions 0.8.3 and higher, Tez needs Hadoop to be of version 2.6.0 or higher.
+versions 0.8.3 and higher, Tez needs Apache Hadoop to be of version 2.6.0 or higher.
 
 1.  Deploy Apache Hadoop using version of 2.6.0 or higher.
     -   You need to change the value of the hadoop.version property in the
         top-level pom.xml to match the version of the hadoop branch being used.
-        ```
-        $ hadoop version
-        ```
+
+    ```
+    $ hadoop version
+    ```
+
 2.  Build tez using `mvn clean package -DskipTests=true -Dmaven.javadoc.skip=true`
     -   This assumes that you have already installed JDK6 or later and Maven 3 or later.
     -   Tez also requires Protocol Buffers 2.5.0, including the protoc-compiler.
@@ -51,16 +53,16 @@ versions 0.8.3 and higher, Tez needs Hadoop to be of version 2.6.0 or higher.
         at tez-dist/target/tez-x.y.z-SNAPSHOT.tar.gz
     -   Assuming that the tez jars are put in /apps/ on HDFS, the
         command would be
-        ```
-            hadoop dfs -mkdir /apps/tez-x.y.z-SNAPSHOT
-            hadoop dfs -copyFromLocal tez-dist/target/tez-x.y.z-SNAPSHOT-archive.tar.gz /apps/tez-x.y.z-SNAPSHOT/
-        ```
+
+    ```
+    hadoop dfs -mkdir /apps/tez-x.y.z-SNAPSHOT
+    hadoop dfs -copyFromLocal tez-dist/target/tez-x.y.z-SNAPSHOT-archive.tar.gz /apps/tez-x.y.z-SNAPSHOT/
+    ```
+
     -   tez-site.xml configuration.
         -   Set tez.lib.uris to point to the tar.gz uploaded to HDFS.
             Assuming the steps mentioned so far were followed,
-            ```
-            set tez.lib.uris to "${fs.defaultFS}/apps/tez-x.y.z-SNAPSHOT/tez-x.y.z-SNAPSHOT.tar.gz"
-            ```
+            set tez.lib.uris to `${fs.defaultFS}/apps/tez-x.y.z-SNAPSHOT/tez-x.y.z-SNAPSHOT.tar.gz`
         -   Ensure tez.use.cluster.hadoop-libs is not set in tez-site.xml,
             or if it is set, the value should be false
     -  Please note that the tarball version should match the version of
@@ -75,16 +77,20 @@ versions 0.8.3 and higher, Tez needs Hadoop to be of version 2.6.0 or higher.
     -   Extract the tez minimal tarball created in step 2 to a local directory
         (assuming TEZ_JARS is where the files will be decompressed for
         the next steps)
-        ```
-        tar -xvzf tez-dist/target/tez-x.y.z-minimal.tar.gz -C $TEZ_JARS
-        ```
+
+    ```
+    tar -xvzf tez-dist/target/tez-x.y.z-minimal.tar.gz -C $TEZ_JARS
+    ```
+
     -   set TEZ_CONF_DIR to the location of tez-site.xml
     -   Add $TEZ_CONF_DIR, ${TEZ_JARS}/* and ${TEZ_JARS}/lib/* to the application classpath.
         For example, doing it via the standard Hadoop tool chain would use the following command
 	to set up the application classpath:
-        ```
-        export HADOOP_CLASSPATH=${TEZ_CONF_DIR}:${TEZ_JARS}/*:${TEZ_JARS}/lib/*
-        ```
+
+    ```
+    export HADOOP_CLASSPATH=${TEZ_CONF_DIR}:${TEZ_JARS}/*:${TEZ_JARS}/lib/*
+    ```
+
     -   Please note the "*" which is an important requirement when
         setting up classpaths for directories containing jar files.
 6.  There is a basic example of using an MRR job in the tez-examples.jar.
@@ -126,22 +132,126 @@ versions 0.8.3 and higher, Tez needs Hadoop to be of version 2.6.0 or higher.
     can be verified by looking at the AM’s logs from the YARN ResourceManager UI.
     This needs mapred-site.xml to have "mapreduce.framework.name" set to "yarn-tez"
 
+Various ways to configure tez.lib.uris
+---------------------------------------
+
+The `tez.lib.uris` configuration property supports a comma-separated list of values. The
+types of values supported are:
+  - Path to a simple file
+  - Path to a directory
+  - Path to a compressed archive (tarball, zip, etc.).
+
+For simple files and directories, Tez will add these files and the first-level entries in the
+directories (recursive traversal of dirs is not supported) to the working directory of the
+Tez runtime, and they will automatically be included in the classpath. Archives, i.e.
+files whose names end with well-known compressed archive suffixes such as 'tgz',
+'tar.gz', 'zip', etc., will also be uncompressed into the container working directory. However,
+given that the archive structure is not known to the Tez framework, the user is expected to
+configure `tez.lib.uris.classpath` to ensure that the nested directory structure of an
+archive is added to the classpath. These classpath values should be relative, i.e. the entries
+should start with "./".
+
 Hadoop Installation dependent Install/Deploy Instructions
 ---------------------------------------------------------
+
 The above install instructions use Tez with pre-packaged Hadoop libraries included in the package and is the
-recommended method for installation. If its needed to make Tez use the existing cluster Hadoop libraries then
-follow this alternate machanism to setup Tez to use Hadoop libraries from the cluster.
-Step 3 above changes as follows. Also subsequent steps would use tez-dist/target/tez-x.y.z-minimal.tar.gz instead of tez-dist/target/tez-x.y.z.tar.gz
-- A tez build without Hadoop dependencies will be available at tez-dist/target/tez-x.y.z-minimal.tar.gz
-- Assuming that the tez jars are put in /apps/ on HDFS, the command would be
-"hadoop fs -mkdir /apps/tez-x.y.z"
-"hadoop fs -copyFromLocal tez-dist/target/tez-x.y.z-minimal.tar.gz /apps/tez-x.y.z"
-- tez-site.xml configuration
-- Set tez.lib.uris to point to the paths in HDFS containing the tez jars. Assuming the steps mentioned so far were followed,
-set tez.lib.uris to "${fs.defaultFS}/apps/tez-x.y.z/tez-x.y.z-minimal.tar.gz
-- set tez.use.cluster.hadoop-libs to true
+recommended method for installation. A full tarball with all dependencies is a better approach to ensure
+that existing jobs continue to run during a cluster's rolling upgrade.
+
+Although the `tez.lib.uris` configuration options enable a wide variety of usage patterns, there
+are 2 main alternative modes that are supported by the framework: 
+
+1. Mode A: Using a tez tarball on HDFS along with Hadoop libraries available on the cluster.
+2. Mode B: Using a tez tarball along with the Hadoop tarball.
+ 
+Both these modes will require a tez build without Hadoop dependencies and that is available at
+tez-dist/target/tez-x.y.z-minimal.tar.gz.
+
+For Mode A: Tez tarball using existing cluster Hadoop libraries by leveraging yarn.application.classpath
+--------------------------------------------------------------------------------------------------------
+
+This mode is not recommended for clusters that use rolling upgrades. Additionally, it is the user's responsibility
+to ensure that the tez version being used is compatible with the version of Hadoop running on the cluster.
+Step 3 above changes as follows. Also subsequent steps should use tez-dist/target/tez-x.y.z-minimal.tar.gz
+instead of tez-dist/target/tez-x.y.z.tar.gz
+
+  - A tez build without Hadoop dependencies will be available at tez-dist/target/tez-x.y.z-minimal.tar.gz
+    Assuming that the tez jars are put in /apps/ on HDFS, the command would be
+
+    ```
+    hadoop fs -mkdir /apps/tez-x.y.z
+    hadoop fs -copyFromLocal tez-dist/target/tez-x.y.z-minimal.tar.gz /apps/tez-x.y.z
+    ```
+
+  - tez-site.xml configuration
+    - Set tez.lib.uris to point to the paths in HDFS containing the tez jars. Assuming the steps mentioned so far were followed,
+      set tez.lib.uris to `${fs.defaultFS}/apps/tez-x.y.z/tez-x.y.z-minimal.tar.gz`
+    - Set tez.use.cluster.hadoop-libs to true
+
+For Mode B: Tez tarball with Hadoop tarball
+--------------------------------------------
+
+This mode will support rolling upgrades. It is the user's responsibility to ensure that the
+versions of Tez and Hadoop being used are compatible.
+To do this configuration, we need to change Step 3 of the
+default instructions in the following ways.
+
+  - Assuming that the tez archives/jars are put in /apps/ on HDFS, the command to put this
+minimal Tez archive into HDFS would be:
+
+  ```
+  hadoop fs -mkdir /apps/tez-x.y.z
+  hadoop fs -copyFromLocal tez-dist/target/tez-x.y.z-minimal.tar.gz /apps/tez-x.y.z
+  ```
+
+  - Alternatively, you can put the minimal directory directly into HDFS and
+  reference the jars, instead of using an archive. The command to put
+  the minimal directory into HDFS would be:
+
+  ```
+  hadoop fs -copyFromLocal tez-dist/target/tez-x.y.z-minimal/* /apps/tez-x.y.z
+  ```
+
+  - After building hadoop, the hadoop tarball will be available at
+  hadoop/hadoop-dist/target/hadoop-x.y.z-SNAPSHOT.tar.gz
+  - Assuming that the hadoop jars are put in /apps/ on HDFS, the command to put this
+    Hadoop archive into HDFS would be:
+
+  ```
+  hadoop fs -mkdir /apps/hadoop-x.y.z
+  hadoop fs -copyFromLocal hadoop-dist/target/hadoop-x.y.z-SNAPSHOT.tar.gz /apps/hadoop-x.y.z
+  ```
+
+  - tez-site.xml configuration
+     - Set tez.lib.uris to point to the archives and jars that are needed for Tez/Hadoop.
+
+     - Example: When using both Tez and Hadoop archives, set tez.lib.uris to
+     `${fs.defaultFS}/apps/tez-x.y.z/tez-x.y.z-minimal.tar.gz#tez,${fs.defaultFS}/apps/hadoop-x.y.z/hadoop-x.y.z-SNAPSHOT.tar.gz#hadoop-mapreduce`
+
+    - Example: When using Tez jars with a Hadoop archive, set tez.lib.uris to:
+    `${fs.defaultFS}/apps/tez-x.y.z,${fs.defaultFS}/apps/tez-x.y.z/lib,${fs.defaultFS}/apps/hadoop-x.y.z/hadoop-x.y.z-SNAPSHOT.tar.gz#hadoop-mapreduce`
+
+    - In tez.lib.uris, the text immediately following the '#' symbol is the fragment that
+      refers to the symlink that will be created for the archive.  If no fragment is given,
+      the symlink will be set to the name of the archive. Fragments should not be given
+      to directories or jars.
+
+    - If any archives are specified in tez.lib.uris, then tez.lib.uris.classpath must be set
+      to define the classpath for these archives as the archive structure is not known.
+    - Example: Classpath when using both Tez and Hadoop archives, set tez.lib.uris.classpath to:
+
+    ```
+./tez/*:./tez/lib/*:./hadoop-mapreduce/hadoop-x.y.z-SNAPSHOT/share/hadoop/common/*:./hadoop-mapreduce/hadoop-x.y.z-SNAPSHOT/share/hadoop/common/lib/*:./hadoop-mapreduce/hadoop-x.y.z-SNAPSHOT/share/hadoop/hdfs/*:./hadoop-mapreduce/hadoop-x.y.z-SNAPSHOT/share/hadoop/hdfs/lib/*:./hadoop-mapreduce/hadoop-x.y.z-SNAPSHOT/share/hadoop/yarn/*:./hadoop-mapreduce/hadoop-x.y.z-SNAPSHOT/share/hadoop/yarn/lib/*:./hadoop-mapreduce/hadoop-x.y.z-SNAPSHOT/share/hadoop/mapreduce/*:./hadoop-mapreduce/hadoop-x.y.z-SNAPSHOT/share/hadoop/mapreduce/lib/*
+    ```
+
+    - Example: Classpath when using Tez jars with a Hadoop archive, set tez.lib.uris.classpath to:
+
+    ```
+./hadoop-mapreduce/hadoop-x.y.z-SNAPSHOT/share/hadoop/common/*:./hadoop-mapreduce/hadoop-x.y.z-SNAPSHOT/share/hadoop/common/lib/*:./hadoop-mapreduce/hadoop-x.y.z-SNAPSHOT/share/hadoop/hdfs/*:./hadoop-mapreduce/hadoop-x.y.z-SNAPSHOT/share/hadoop/hdfs/lib/*:./hadoop-mapreduce/hadoop-x.y.z-SNAPSHOT/share/hadoop/yarn/*:./hadoop-mapreduce/hadoop-x.y.z-SNAPSHOT/share/hadoop/yarn/lib/*:./hadoop-mapreduce/hadoop-x.y.z-SNAPSHOT/share/hadoop/mapreduce/*:./hadoop-mapreduce/hadoop-x.y.z-SNAPSHOT/share/hadoop/mapreduce/lib/*
+    ```
 
 
 [Install instructions for older versions of Tez (pre 0.5.0)](./install_pre_0_5_0.html)
 -----------------------------------------------------------------------------------
 
+
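
The `tez.lib.uris.classpath` values in the install.md examples above follow a regular pattern: each archive is reachable under `./<fragment>/`, and the Hadoop tarball contributes `share/hadoop/<component>/*` plus `share/hadoop/<component>/lib/*` for each component. A minimal sketch of generating such a value is shown here. It assumes the Hadoop tarball unpacks into a single top-level directory such as `hadoop-x.y.z-SNAPSHOT`, as in the examples; the class and method names are hypothetical and not part of Tez.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Tez): assembles a tez.lib.uris.classpath
// value for archives listed in tez.lib.uris.
final class LibUrisClasspathSketch {

  // Sub-directories of share/hadoop that appear in the install.md example.
  private static final String[] HADOOP_COMPONENTS = {"common", "hdfs", "yarn", "mapreduce"};

  // Classpath entries for a Tez archive localized under ./<fragment>.
  static String tezArchiveClasspath(String fragment) {
    return "./" + fragment + "/*:./" + fragment + "/lib/*";
  }

  // Classpath entries for a Hadoop archive localized under ./<fragment>.
  // topLevelDir is the directory the tarball unpacks into (an assumption
  // about the tarball layout, taken from the install.md example).
  static String hadoopArchiveClasspath(String fragment, String topLevelDir) {
    List<String> entries = new ArrayList<>();
    String base = "./" + fragment + "/" + topLevelDir + "/share/hadoop/";
    for (String component : HADOOP_COMPONENTS) {
      entries.add(base + component + "/*");
      entries.add(base + component + "/lib/*");
    }
    return String.join(":", entries);
  }

  public static void main(String[] args) {
    String classpath = tezArchiveClasspath("tez") + ":"
        + hadoopArchiveClasspath("hadoop-mapreduce", "hadoop-x.y.z-SNAPSHOT");
    System.out.println(classpath);
  }
}
```

Every generated entry is relative and starts with `./`, which is what the documentation asks for. The fragment passed in must match the `#fragment` used for the same archive in `tez.lib.uris`.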

http://git-wip-us.apache.org/repos/asf/tez/blob/6d8eb3a5/tez-api/src/main/java/org/apache/tez/client/TezClientUtils.java
----------------------------------------------------------------------
diff --git a/tez-api/src/main/java/org/apache/tez/client/TezClientUtils.java b/tez-api/src/main/java/org/apache/tez/client/TezClientUtils.java
index 43a97fa..a1ad1d8 100644
--- a/tez-api/src/main/java/org/apache/tez/client/TezClientUtils.java
+++ b/tez-api/src/main/java/org/apache/tez/client/TezClientUtils.java
@@ -73,6 +73,7 @@ import org.apache.hadoop.yarn.api.records.LocalResourceType;
 import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
 import org.apache.hadoop.yarn.api.records.Priority;
 import org.apache.hadoop.yarn.api.records.Resource;
+import org.apache.hadoop.yarn.api.records.URL;
 import org.apache.hadoop.yarn.api.records.YarnApplicationState;
 import org.apache.hadoop.yarn.conf.YarnConfiguration;
 import org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException;
@@ -132,7 +133,8 @@ public class TezClientUtils {
 
     Path p = new Path(uri);
     FileSystem fs = p.getFileSystem(conf);
-    p = fs.resolvePath(p);
+    p = fs.resolvePath(p.makeQualified(fs.getUri(),
+        fs.getWorkingDirectory()));
     FileSystem targetFS = p.getFileSystem(conf); 
     if (targetFS.isDirectory(p)) {
       return targetFS.listStatus(p);
@@ -175,37 +177,12 @@ public class TezClientUtils {
 
       LOG.info("Using tez.lib.uris value from configuration: "
           + conf.get(TezConfiguration.TEZ_LIB_URIS));
+      LOG.info("Using tez.lib.uris.classpath value from configuration: "
+          + conf.get(TezConfiguration.TEZ_LIB_URIS_CLASSPATH));
 
-      if (tezJarUris.length == 1 && (
-              tezJarUris[0].endsWith(".tar.gz") ||
-              tezJarUris[0].endsWith(".tgz") ||
-              tezJarUris[0].endsWith(".zip") ||
-              tezJarUris[0].endsWith(".tar"))) {
-        String fileName = tezJarUris[0];
+      usingTezArchive = addLocalResources(conf, tezJarUris,
+                         tezJarResources, credentials);
 
-        FileStatus fStatus = getLRFileStatus(fileName, conf)[0];
-        LocalResourceVisibility lrVisibility;
-        if (checkAncestorPermissionsForAllUsers(conf, fileName, FsAction.EXECUTE) &&
-            fStatus.getPermission().getOtherAction().implies(FsAction.READ)) {
-          lrVisibility = LocalResourceVisibility.PUBLIC;
-        } else {
-          lrVisibility = LocalResourceVisibility.PRIVATE;
-        }
-        tezJarResources.put(TezConstants.TEZ_TAR_LR_NAME,
-            LocalResource.newInstance(
-                ConverterUtils.getYarnUrlFromPath(fStatus.getPath()),
-                LocalResourceType.ARCHIVE,
-                lrVisibility,
-                fStatus.getLen(),
-                fStatus.getModificationTime()));
-        Path[] tezJarPaths = { fStatus.getPath() };
-        // obtain credentials
-        TokenCache.obtainTokensForFileSystems(credentials, tezJarPaths, conf);
-        usingTezArchive = true;
-      } else { // Treat as non-archives
-        addLocalResources(conf, tezJarUris, tezJarResources, credentials);
-      }
-      
       if (tezJarResources.isEmpty()) {
         throw new TezUncheckedException(
             "No files found in locations specified in "
@@ -221,41 +198,101 @@ public class TezClientUtils {
     return usingTezArchive;
   }
 
-  private static void addLocalResources(Configuration conf, String[] configUris,
-      Map<String, LocalResource> tezJarResources, Credentials credentials) throws IOException {
+  private static boolean addLocalResources(Configuration conf,
+      String[] configUris, Map<String, LocalResource> tezJarResources,
+      Credentials credentials) throws IOException {
+    boolean usingTezArchive = false;
     if (configUris == null || configUris.length == 0) {
-      return;
+      return usingTezArchive;
     }
     List<Path> configuredPaths = Lists.newArrayListWithCapacity(configUris.length);
     for (String configUri : configUris) {
-      boolean ancestorsHavePermission = checkAncestorPermissionsForAllUsers(conf, configUri,
-          FsAction.EXECUTE);
+      URI u = null;
+      try {
+        u = new URI(configUri);
+      } catch (URISyntaxException e) {
+        throw new IOException("Unable to convert " + configUri + " to URI", e);
+      }
+      Path p = new Path(u);
+      FileSystem remoteFS = p.getFileSystem(conf);
+      p = remoteFS.resolvePath(p.makeQualified(remoteFS.getUri(),
+          remoteFS.getWorkingDirectory()));
+
+      LocalResourceType type = null;
+
+      //Check if path is an archive
+      if(p.getName().endsWith(".tar.gz") ||
+           p.getName().endsWith(".tgz") ||
+           p.getName().endsWith(".zip") ||
+           p.getName().endsWith(".tar")) {
+          type = LocalResourceType.ARCHIVE;
+        } else {
+          type = LocalResourceType.FILE;
+        }
+
       FileStatus [] fileStatuses = getLRFileStatus(configUri, conf);
+
       for (FileStatus fStatus : fileStatuses) {
+        String linkName;
         if (fStatus.isDirectory()) {
           // Skip directories - no recursive search support.
           continue;
         }
+        // If the resource is an archive, we've already done this work
+        if(type != LocalResourceType.ARCHIVE) {
+          u = fStatus.getPath().toUri();
+          p = new Path(u);
+          remoteFS = p.getFileSystem(conf);
+          p = remoteFS.resolvePath(p.makeQualified(remoteFS.getUri(),
+              remoteFS.getWorkingDirectory()));
+          if(null != u.getFragment()) {
+            LOG.warn("Fragment set for link being interpreted as a file, " +
+                "URI: " + u.toString());
+          }
+        }
+
+        // Add URI fragment or just the filename
+        Path name = new Path((null == u.getFragment())
+          ? p.getName()
+          : u.getFragment());
+        if (name.isAbsolute()) {
+          throw new IllegalArgumentException("Resource name must be "
+              + "relative, not absolute: " + name
+              + " in URI: " + u.toString());
+        }
+
+        URL url = ConverterUtils.getYarnUrlFromURI(p.toUri());
+        linkName = name.toUri().getPath();
+        // For legacy reasons, set archive to tezlib if there is
+        // only a single archive and no fragment
+        if(type == LocalResourceType.ARCHIVE &&
+            configUris.length == 1 && null == u.getFragment()) {
+          linkName = TezConstants.TEZ_TAR_LR_NAME;
+          usingTezArchive = true;
+        }
+
         LocalResourceVisibility lrVisibility;
-        if (ancestorsHavePermission &&
+        if (checkAncestorPermissionsForAllUsers(conf, url.getFile(),
+            FsAction.EXECUTE) &&
             fStatus.getPermission().getOtherAction().implies(FsAction.READ)) {
           lrVisibility = LocalResourceVisibility.PUBLIC;
         } else {
           lrVisibility = LocalResourceVisibility.PRIVATE;
         }
-        String rsrcName = fStatus.getPath().getName();
-        if (tezJarResources.containsKey(rsrcName)) {
+
+        if (tezJarResources.containsKey(linkName)) {
           String message = "Duplicate resource found"
-              + ", resourceName=" + rsrcName
+              + ", resourceName=" + linkName
               + ", existingPath=" +
-              tezJarResources.get(rsrcName).getResource().toString()
+              tezJarResources.get(linkName).getResource().toString()
               + ", newPath=" + fStatus.getPath();
           LOG.warn(message);
         }
-        tezJarResources.put(rsrcName,
+
+        tezJarResources.put(linkName,
             LocalResource.newInstance(
-                ConverterUtils.getYarnUrlFromPath(fStatus.getPath()),
-                LocalResourceType.FILE,
+                url,
+                type,
                 lrVisibility,
                 fStatus.getLen(),
                 fStatus.getModificationTime()));
@@ -267,6 +304,7 @@ public class TezClientUtils {
       TokenCache.obtainTokensForFileSystems(credentials,
           configuredPaths.toArray(new Path[configuredPaths.size()]), conf);
     }
+    return usingTezArchive;
   }
 
   static void processTezLocalCredentialsFile(Credentials credentials, Configuration conf)
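
The link-name rules that the new `addLocalResources` applies to each entry can be summarized in a small standalone sketch. It uses only `java.net.URI` and leaves out everything Hadoop-specific (path qualification, directory expansion, visibility checks, credentials), so it illustrates the naming decision only and is not the Tez implementation. The class name, method names, and the `hdfs://namenode:8020` authority are hypothetical; `"tezlib"` stands in for `TezConstants.TEZ_TAR_LR_NAME`, as indicated by the comments in the diff.

```java
import java.net.URI;

// Hypothetical sketch of how a tez.lib.uris entry maps to a localized link name.
final class LinkNameSketch {

  // Stand-in for TezConstants.TEZ_TAR_LR_NAME.
  static final String LEGACY_LINK_NAME = "tezlib";

  // Same suffix test as the diff: these names are localized as ARCHIVE.
  static boolean isArchive(String fileName) {
    return fileName.endsWith(".tar.gz") || fileName.endsWith(".tgz")
        || fileName.endsWith(".zip") || fileName.endsWith(".tar");
  }

  // configUri: one entry of tez.lib.uris; totalUris: number of entries configured.
  static String linkName(String configUri, int totalUris) {
    URI u = URI.create(configUri);
    String path = u.getPath();
    String fileName = path.substring(path.lastIndexOf('/') + 1);
    String fragment = u.getFragment();

    // Legacy behaviour: a single archive without a fragment is linked as "tezlib".
    if (isArchive(fileName) && totalUris == 1 && fragment == null) {
      return LEGACY_LINK_NAME;
    }
    // Otherwise use the fragment if present, else the file name.
    String name = (fragment == null) ? fileName : fragment;
    if (name.startsWith("/")) {
      throw new IllegalArgumentException(
          "Resource name must be relative, not absolute: " + name);
    }
    return name;
  }

  public static void main(String[] args) {
    System.out.println(linkName(
        "hdfs://namenode:8020/apps/tez-x.y.z/tez-x.y.z-minimal.tar.gz#tez", 2));
    System.out.println(linkName(
        "hdfs://namenode:8020/apps/tez-x.y.z/tez-x.y.z-minimal.tar.gz", 1));
  }
}
```

With two configured entries and a `#tez` fragment the archive is linked as `tez`; a lone archive without a fragment keeps the legacy `tezlib` name, which is what lets the old default classpath continue to work.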

http://git-wip-us.apache.org/repos/asf/tez/blob/6d8eb3a5/tez-api/src/main/java/org/apache/tez/common/TezYARNUtils.java
----------------------------------------------------------------------
diff --git a/tez-api/src/main/java/org/apache/tez/common/TezYARNUtils.java b/tez-api/src/main/java/org/apache/tez/common/TezYARNUtils.java
index d7093db..c505ca8 100644
--- a/tez-api/src/main/java/org/apache/tez/common/TezYARNUtils.java
+++ b/tez-api/src/main/java/org/apache/tez/common/TezYARNUtils.java
@@ -23,6 +23,8 @@ import java.util.Map;
 import java.util.regex.Matcher;
 import java.util.regex.Pattern;
 
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 import org.apache.hadoop.classification.InterfaceAudience.Private;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.util.Shell;
@@ -35,6 +37,7 @@ import org.apache.tez.dag.api.TezConstants;
 
 @Private
 public class TezYARNUtils {
+  private static Logger LOG = LoggerFactory.getLogger(TezYARNUtils.class);
   
   private static Pattern ENV_VARIABLE_PATTERN = Pattern.compile(Shell.getEnvironmentVariableRegex());
 
@@ -54,27 +57,41 @@ public class TezYARNUtils {
         .append(Environment.PWD.$() + File.separator + "*")
         .append(File.pathSeparator);
 
-    // Next add the tez libs, if specified via an archive.
-    if (usingArchive) {
-      // Add PWD/tezlib/*
-      classpathBuilder.append(Environment.PWD.$())
-          .append(File.separator)
-          .append(TezConstants.TEZ_TAR_LR_NAME)
-          .append(File.separator)
-          .append("*")
-          .append(File.pathSeparator);
+    String [] tezLibUrisClassPath = conf.getStrings(TezConfiguration.TEZ_LIB_URIS_CLASSPATH);
 
-      // Add PWD/tezlib/lib/*
-      classpathBuilder.append(Environment.PWD.$())
-          .append(File.separator)
-          .append(TezConstants.TEZ_TAR_LR_NAME)
-          .append(File.separator)
-          .append("lib")
-          .append(File.separator)
-          .append("*")
-          .append(File.pathSeparator);
-    }
+    if(!conf.getBoolean(TezConfiguration.TEZ_IGNORE_LIB_URIS, false) &&
+       tezLibUrisClassPath != null && tezLibUrisClassPath.length != 0) {
+      for(String c : tezLibUrisClassPath) {
+        classpathBuilder.append(c.trim())
+        .append(File.pathSeparator);
+      }
+    } else {
+      if(conf.getBoolean(TezConfiguration.TEZ_IGNORE_LIB_URIS, false)) {
+        LOG.info("Ignoring '" + TezConfiguration.TEZ_LIB_URIS + "' since  '" +
+            TezConfiguration.TEZ_IGNORE_LIB_URIS + "' is set to true ");
+      }
+
+      // Legacy: Next add the tez libs, if specified via an archive.
+      if (usingArchive) {
+        // Add PWD/tezlib/*
+        classpathBuilder.append(Environment.PWD.$())
+            .append(File.separator)
+            .append(TezConstants.TEZ_TAR_LR_NAME)
+            .append(File.separator)
+            .append("*")
+            .append(File.pathSeparator);
 
+        // Legacy: Add PWD/tezlib/lib/*
+        classpathBuilder.append(Environment.PWD.$())
+            .append(File.separator)
+            .append(TezConstants.TEZ_TAR_LR_NAME)
+            .append(File.separator)
+            .append("lib")
+            .append(File.separator)
+            .append("*")
+            .append(File.pathSeparator);
+      }
+    }
     // Last add HADOOP_CLASSPATH, if it's required.
     if (conf.getBoolean(TezConfiguration.TEZ_USE_CLUSTER_HADOOP_LIBS,
         TezConfiguration.TEZ_USE_CLUSTER_HADOOP_LIBS_DEFAULT)) {
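
The ordering that the modified `getFrameworkClasspath` produces can be sketched without any Hadoop classes. The following is a simplified illustration, assuming the classpath starts with the working directory and `PWD/*`; `"$PWD"` stands in for `Environment.PWD.$()` and `"tezlib"` for `TezConstants.TEZ_TAR_LR_NAME`. The class and method names are hypothetical, and the trailing `HADOOP_CLASSPATH` handling is omitted.

```java
import java.io.File;

// Hypothetical sketch of the classpath ordering after this change.
final class FrameworkClasspathSketch {

  static final String PWD = "$PWD";               // stand-in for Environment.PWD.$()
  static final String LEGACY_LINK_NAME = "tezlib"; // stand-in for TEZ_TAR_LR_NAME

  // libUrisClasspath: entries of tez.lib.uris.classpath (may be null).
  // ignoreLibUris:    value of the "ignore lib uris" setting.
  // usingArchive:     true when the legacy single-archive case was detected.
  static String build(String[] libUrisClasspath, boolean ignoreLibUris,
      boolean usingArchive) {
    StringBuilder cp = new StringBuilder();
    cp.append(PWD).append(File.pathSeparator);
    cp.append(PWD).append(File.separator).append("*").append(File.pathSeparator);

    if (!ignoreLibUris && libUrisClasspath != null && libUrisClasspath.length != 0) {
      // New behaviour: user-supplied entries come right after PWD.
      for (String entry : libUrisClasspath) {
        cp.append(entry.trim()).append(File.pathSeparator);
      }
    } else if (usingArchive) {
      // Legacy behaviour: PWD/tezlib/* and PWD/tezlib/lib/*.
      String base = PWD + File.separator + LEGACY_LINK_NAME + File.separator;
      cp.append(base).append("*").append(File.pathSeparator);
      cp.append(base).append("lib").append(File.separator).append("*")
          .append(File.pathSeparator);
    }
    return cp.toString();
  }

  public static void main(String[] args) {
    System.out.println(build(new String[] {"./tez/*", "./tez/lib/*"}, false, false));
    System.out.println(build(null, false, true));
  }
}
```

Note that a non-empty `tez.lib.uris.classpath` replaces the legacy `tezlib` entries rather than adding to them, so a configuration that sets it must list every archive directory it needs.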

http://git-wip-us.apache.org/repos/asf/tez/blob/6d8eb3a5/tez-api/src/main/java/org/apache/tez/dag/api/TezConfiguration.java
----------------------------------------------------------------------
diff --git a/tez-api/src/main/java/org/apache/tez/dag/api/TezConfiguration.java b/tez-api/src/main/java/org/apache/tez/dag/api/TezConfiguration.java
index 6785405..9f9defe 100644
--- a/tez-api/src/main/java/org/apache/tez/dag/api/TezConfiguration.java
+++ b/tez-api/src/main/java/org/apache/tez/dag/api/TezConfiguration.java
@@ -1019,19 +1019,19 @@ public class TezConfiguration extends Configuration {
    * The location of the Tez libraries which will be localized for DAGs.
    * This follows the following semantics
    * <ol>
-   * <li> To use a single .tar.gz or .tgz file (generated by the tez build), the full path to this
+   * <li> To use .tar.gz or .tgz files (generated by the tez or hadoop builds), the full path to this
    * file (including filename) should be specified. The internal structure of the uncompressed tgz
-   * will be retained under $CWD/tezlib.</li>
+   * will be defined by 'tez.lib.uris.classpath'</li>
    *
    * <li> If a single file is specified without the above mentioned extensions - it will be treated as
    * a regular file. This means it will not be uncompressed during runtime. </li>
    *
    * <li> If multiple entries exist
    * <ul>
-   * <li> Files: will be treated as regular files (not uncompressed during runtime) </li>
+   * <li> Regular Files: will be treated as regular files (not uncompressed during runtime) </li>
+   * <li> Archive Files: will be treated as archives and will be uncompressed during runtime </li>
    * <li> Directories: all files under the directory (non-recursive) will be made available (but not
    * uncompressed during runtime). </li>
-   * <li> All files / contents of directories are flattened into a single directory - $CWD </li>
    * </ul>
    * </ol>
    */
@@ -1040,6 +1040,19 @@ public class TezConfiguration extends Configuration {
   public static final String TEZ_LIB_URIS = TEZ_PREFIX + "lib.uris";
 
   /**
+   *
+   * Specify additional user classpath information to be used for Tez AM and all containers.
+   * This will be appended to the classpath after PWD
+   * 
+   * 'tez.lib.uris.classpath' defines the relative classpath into the archives
+   * that are set in 'tez.lib.uris'
+   *
+   */
+  @ConfigurationScope(Scope.AM)
+  @ConfigurationProperty
+  public static final String TEZ_LIB_URIS_CLASSPATH = TEZ_PREFIX + "lib.uris.classpath";
+
+  /**
    * Auxiliary resources to be localized for the Tez AM and all its containers.
    *
    * Value is comma-separated list of fully-resolved directories or file paths. All resources

http://git-wip-us.apache.org/repos/asf/tez/blob/6d8eb3a5/tez-api/src/test/java/org/apache/tez/client/TestTezClientUtils.java
----------------------------------------------------------------------
diff --git a/tez-api/src/test/java/org/apache/tez/client/TestTezClientUtils.java b/tez-api/src/test/java/org/apache/tez/client/TestTezClientUtils.java
index bcf3239..4948260 100644
--- a/tez-api/src/test/java/org/apache/tez/client/TestTezClientUtils.java
+++ b/tez-api/src/test/java/org/apache/tez/client/TestTezClientUtils.java
@@ -207,6 +207,91 @@ public class TestTezClientUtils {
     assertFalse(localizedMap.isEmpty());
   }
 
+    /**
+   *
+   */
+  @Test (timeout=5000)
+  public void validateSetTezJarLocalResourcesMultipleTarballs() throws Exception {
+    FileSystem localFs = FileSystem.getLocal(new Configuration());
+    StringBuilder tezLibUris = new StringBuilder();
+
+    // Create 2 files
+    Path topDir = new Path(TEST_ROOT_DIR, "validatemultipletarballs");
+    if (localFs.exists(topDir)) {
+      localFs.delete(topDir, true);
+    }
+    localFs.mkdirs(topDir);
+
+    Path tarFile1 = new Path(topDir, "f1.tar.gz");
+    Path tarFile2 = new Path(topDir, "f2.tar.gz");
+
+    Assert.assertTrue(localFs.createNewFile(tarFile1));
+    Assert.assertTrue(localFs.createNewFile(tarFile2));
+    tezLibUris.append(localFs.makeQualified(tarFile1).toString()).append("#tar1").append(",");
+    tezLibUris.append(localFs.makeQualified(tarFile2).toString()).append("#tar2").append(",");
+
+    TezConfiguration conf = new TezConfiguration();
+    conf.set(TezConfiguration.TEZ_LIB_URIS, tezLibUris.toString());
+    Credentials credentials = new Credentials();
+    Map<String, LocalResource> localizedMap = new HashMap<String, LocalResource>();
+    TezClientUtils.setupTezJarsLocalResources(conf, credentials, localizedMap);
+    Set<String> resourceNames = localizedMap.keySet();
+    Assert.assertEquals(2, resourceNames.size());
+    Assert.assertTrue(resourceNames.contains("tar1"));
+    Assert.assertTrue(resourceNames.contains("tar2"));
+    Assert.assertFalse(resourceNames.contains("f1.tar.gz"));
+    Assert.assertFalse(resourceNames.contains("f2.tar.gz"));
+
+
+    Assert.assertTrue(localFs.delete(tarFile1, true));
+    Assert.assertTrue(localFs.delete(tarFile2, true));
+    Assert.assertTrue(localFs.delete(topDir, true));
+  }
+
+    /**
+   *
+   */
+  @Test (timeout=5000)
+  public void validateSetTezJarLocalResourcesMixTarballAndJar() throws Exception {
+    FileSystem localFs = FileSystem.getLocal(new Configuration());
+    StringBuilder tezLibUris = new StringBuilder();
+
+    // Create 2 jars and 1 archive
+    Path topDir = new Path(TEST_ROOT_DIR, "validatetarballandjar");
+    if (localFs.exists(topDir)) {
+      localFs.delete(topDir, true);
+    }
+    localFs.mkdirs(topDir);
+
+    Path tarFile1 = new Path(topDir, "f1.tar.gz");
+    Path jarFile2 = new Path(topDir, "f2.jar");
+    Path jarFile3 = new Path(topDir, "f3.jar");
+
+    Assert.assertTrue(localFs.createNewFile(tarFile1));
+    Assert.assertTrue(localFs.createNewFile(jarFile2));
+    Assert.assertTrue(localFs.createNewFile(jarFile3));
+
+    tezLibUris.append(localFs.makeQualified(topDir).toString()).append(",");
+    tezLibUris.append(localFs.makeQualified(tarFile1).toString()).append("#tar1").append(",");
+
+    TezConfiguration conf = new TezConfiguration();
+    conf.set(TezConfiguration.TEZ_LIB_URIS, tezLibUris.toString());
+    Credentials credentials = new Credentials();
+    Map<String, LocalResource> localizedMap = new HashMap<String, LocalResource>();
+    TezClientUtils.setupTezJarsLocalResources(conf, credentials, localizedMap);
+    Set<String> resourceNames = localizedMap.keySet();
+    Assert.assertEquals(4, resourceNames.size());
+    Assert.assertTrue(resourceNames.contains("tar1"));
+    Assert.assertTrue(resourceNames.contains("f1.tar.gz"));
+    Assert.assertTrue(resourceNames.contains("f2.jar"));
+    Assert.assertTrue(resourceNames.contains("f3.jar"));
+
+    Assert.assertTrue(localFs.delete(tarFile1, true));
+    Assert.assertTrue(localFs.delete(jarFile2, true));
+    Assert.assertTrue(localFs.delete(jarFile3, true));
+    Assert.assertTrue(localFs.delete(topDir, true));
+  }
+
   @Test(timeout = 2000)
   // this test checks if the priority field is set properly in the
   // ApplicationSubmissionContext

http://git-wip-us.apache.org/repos/asf/tez/blob/6d8eb3a5/tez-api/src/test/java/org/apache/tez/common/TestTezYARNUtils.java
----------------------------------------------------------------------
diff --git a/tez-api/src/test/java/org/apache/tez/common/TestTezYARNUtils.java b/tez-api/src/test/java/org/apache/tez/common/TestTezYARNUtils.java
index 6e9e06c..2dabf51 100644
--- a/tez-api/src/test/java/org/apache/tez/common/TestTezYARNUtils.java
+++ b/tez-api/src/test/java/org/apache/tez/common/TestTezYARNUtils.java
@@ -80,4 +80,15 @@ public class TestTezYARNUtils {
     Assert.assertEquals("User env should append default env",
         Environment.PWD.$() + File.pathSeparator + "USER_PATH" + File.pathSeparator + "DEFAULT_PATH", value3);
     }
+
+  @Test(timeout = 5000)
+  public void testTezLibUrisClasspath() {
+    Configuration conf = new Configuration(false);
+    conf.set(TezConfiguration.TEZ_LIB_URIS_CLASSPATH, "foobar");
+    String classpath = TezYARNUtils.getFrameworkClasspath(conf, true);
+    Assert.assertTrue(classpath.contains("foobar"));
+    Assert.assertTrue(classpath.contains(Environment.PWD.$()));
+    Assert.assertTrue(classpath.indexOf("foobar") >
+        classpath.indexOf(Environment.PWD.$()));
+  }
 }

