hadoop-common-dev mailing list archives

From "Ahad Rana (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4577) Add Jar "lib" directory to TaskRunner's library.path setting to allow JNI libraries to be deployed via JAR file
Date Thu, 06 Nov 2008 06:33:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12645408#action_12645408 ]

Ahad Rana commented on HADOOP-4577:
-----------------------------------

Hi Steve / Edward, 

I will look into the environment variable issue. I deliberately left it out because the name
of the dynamic loader's search path variable differs by operating system, so handling it would
inject operating system awareness into the code (rather than at the script level, where it is
arguably more acceptable). In our usage scenario, we assume that any dependencies of the
deployed JNI lib are relatively stable and can therefore be pre-installed on the cluster in a
directory already on the LD_LIBRARY_PATH, such as /usr/local/lib.
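
To illustrate the operating system awareness mentioned above, here is a minimal sketch of
what selecting the loader variable in code might look like. The class LoaderPathVariable and
its method are hypothetical and are not part of Hadoop or of HADOOP-4577-v1.patch; the mapping
only covers a few common platforms.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Hadoop or the attached patch.
final class LoaderPathVariable {
    private LoaderPathVariable() {}

    /** Returns the environment variable the dynamic loader searches on the given OS. */
    static String forOsName(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        if (os.contains("mac") || os.contains("darwin")) {
            return "DYLD_LIBRARY_PATH";
        }
        if (os.contains("windows")) {
            return "PATH";
        }
        // Linux and Solaris; other Unix variants may use a different name.
        return "LD_LIBRARY_PATH";
    }

    public static void main(String[] args) {
        // Print the variable name for the JVM's host operating system.
        System.out.println(forOsName(System.getProperty("os.name")));
    }
}
```

Keeping this mapping in a launch script instead avoids putting a per-OS branch like this into
the Java code.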

Adding the platform identifier to the lib path is a valid suggestion. I am curious: do you
envision a cluster deployment with a mixed set of operating system configurations? In that
scenario, you would definitely need a JAR with multiple operating-system-specific versions of
the JNI libraries. In our deployment, we always build our jar on the cluster (since we are
deployed in a data center, and transferring source code is much faster than transferring jar
files across DSL lines), so our build script identifies the host system and builds an
appropriate JNI library for the platform. But if you feel that qualifying the JNI access path
by OS type is important, I will look into using the PlatformName utility under hadoop util to
produce an appropriate path name.
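
A platform-qualified path could be sketched roughly as follows. This is an assumption-laden
illustration: the lib/<platform> layout, the class PlatformLibDir, and the identifier format
(OS name, architecture, and data model joined by dashes, with spaces replaced by underscores)
are all hypothetical. It deliberately does not call Hadoop's PlatformName so that the block
stays self-contained.

```java
import java.io.File;

// Hypothetical sketch; the directory layout and identifier format are assumptions.
final class PlatformLibDir {
    private PlatformLibDir() {}

    /** Builds an identifier such as "Linux-amd64-64" from the given components. */
    static String platformId(String osName, String osArch, String dataModel) {
        return (osName + "-" + osArch + "-" + dataModel).replace(' ', '_');
    }

    /** Resolves the platform-specific subdirectory under the job jar's lib directory. */
    static File resolve(File jobJarLibDir, String platformId) {
        return new File(jobJarLibDir, platformId);
    }

    public static void main(String[] args) {
        String id = platformId(
                System.getProperty("os.name"),
                System.getProperty("os.arch"),
                // sun.arch.data.model is JVM-specific and may be absent.
                System.getProperty("sun.arch.data.model", "unknown"));
        System.out.println(resolve(new File("lib"), id).getPath());
    }
}
```

With such a layout, a single job jar could carry one JNI build per platform, and each task
would add only its own subdirectory to the library path.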

> Add Jar "lib" directory to TaskRunner's library.path setting to allow JNI libraries to be deployed via JAR file
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4577
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4577
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.18.1
>         Environment: Hadoop 0.18.1 cluster with custom JNI shared libraries deployed in the lib directory of the deployment JAR.
>            Reporter: Ahad Rana
>            Assignee: Ahad Rana
>             Fix For: 0.18.3
>
>         Attachments: HADOOP-4577-v1.patch
>
>
> It is extremely convenient to be able to deploy JNI libraries used in a custom map-reduce
> job via the job's JAR file. The TaskRunner already establishes a precedent by automatically
> adding any jar files contained in the "lib" directory of the job jar to the child map/reduce
> process's classpath. Following this convention, it should also be possible to deploy custom
> JNI libraries in the same lib directory. This involves adding the path to the job jar's lib
> directory to the VM's library.path setting (after the jar has been expanded in the job cache
> directory). This does not eliminate the need to add dependent shared libraries that may be
> referenced by the JNI libraries to the system's LD_LIBRARY_PATH variable. In our deployment
> configuration, we usually pre-install third-party shared libraries across the cluster and only
> deploy our custom JNI libraries via the job jar.
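
The proposed change can be sketched as follows. This is a minimal illustration, not the code
in HADOOP-4577-v1.patch: the class LibraryPathSketch and its methods are hypothetical, and it
assumes the job jar has already been expanded into a job cache directory that contains a "lib"
subdirectory.

```java
import java.io.File;

// Hypothetical sketch; not the TaskRunner code from the attached patch.
final class LibraryPathSketch {
    private LibraryPathSketch() {}

    /** Appends the expanded job jar's lib directory to an existing library path. */
    static String withJobLibDir(String existingLibraryPath, File jobCacheDir) {
        String jobLib = new File(jobCacheDir, "lib").getPath();
        if (existingLibraryPath == null || existingLibraryPath.isEmpty()) {
            return jobLib;
        }
        return existingLibraryPath + File.pathSeparator + jobLib;
    }

    /** Formats the path as the JVM argument passed to the child process. */
    static String asJvmArg(String libraryPath) {
        return "-Djava.library.path=" + libraryPath;
    }

    public static void main(String[] args) {
        // "jobcache" stands in for the real job cache directory.
        String path = withJobLibDir(System.getProperty("java.library.path"),
                                    new File("jobcache"));
        System.out.println(asJvmArg(path));
    }
}
```

Note that java.library.path only controls where the JVM looks for libraries loaded through
System.loadLibrary; shared libraries that those JNI libraries depend on are still resolved by
the operating system's loader, which is why they must remain on LD_LIBRARY_PATH.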

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

