hadoop-common-dev mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1660) add support for native library toDistributedCache
Date Tue, 18 Dec 2007 20:05:43 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552876 ]

Arun C Murthy commented on HADOOP-1660:
---------------------------------------

bq. It should be possible to specify using the DistributedCache what are the native libraries
a job needs.

How about something simpler... I propose we add the task's working directory to its LD_LIBRARY_PATH;
then you could distribute your native libs with the DistributedCache and also have it symlink
them into the task's cwd. Then things should work seamlessly. Thoughts?
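
A rough sketch of that idea in plain Java, using only the JDK. The class and method names
are hypothetical and this is not Hadoop code; it assumes the task runs as a child process
started with ProcessBuilder, that the library has already been localized on the node, and
that the working directory is prepended (rather than appended) to LD_LIBRARY_PATH.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Hypothetical helper, not part of Hadoop.
final class TaskLibraryPath {

    /** Returns workDir followed by the existing value, if there is one. */
    static String prependWorkDir(String workDir, String existing) {
        if (existing == null || existing.isEmpty()) {
            return workDir;
        }
        return workDir + File.pathSeparator + existing;
    }

    /** Runs the child in workDir and puts workDir on its LD_LIBRARY_PATH. */
    static void addWorkDirToLibraryPath(ProcessBuilder pb, File workDir) {
        pb.directory(workDir);
        Map<String, String> env = pb.environment();
        env.put("LD_LIBRARY_PATH",
                prependWorkDir(workDir.getAbsolutePath(), env.get("LD_LIBRARY_PATH")));
    }

    /** Symlinks an already-localized library into workDir under its own file name. */
    static Path linkIntoWorkDir(Path cachedLib, Path workDir) {
        Path link = workDir.resolve(cachedLib.getFileName());
        try {
            Files.deleteIfExists(link); // replace a stale link from an earlier attempt
            return Files.createSymbolicLink(link, cachedLib.toAbsolutePath());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With both steps in place, the native library and anything it depends on sit in a directory
the OS loader already searches, so no per-library configuration is needed.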

> add support for native library toDistributedCache 
> --------------------------------------------------
>
>                 Key: HADOOP-1660
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1660
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>         Environment: unix (different handling would be required for windows)
>            Reporter: Alejandro Abdelnur
>
> Currently if a M/R job depends on a JNI-based component, the dynamic library must be available
on all the task nodes. This is not possible, especially when you have no control over the cluster
machines and are just using the cluster as a service.
> It should be possible to specify using the DistributedCache what are the native libraries
a job needs.
> For example via a new method 'public void addLibrary(Path libraryPath, JobConf conf)'.
> The added libraries would make it to the local FS of the task nodes (same way as cached
resources), but instead of being part of the classpath they would be copied to a lib directory,
and that lib directory would be added to the LD_LIBRARY_PATH of the task JVM.
> An alternative would be to set the '-Djava.library.path=' task JVM parameter to the lib
directory above. However, this would break for libraries that depend on other libraries, as
the library depended upon would not be in the LD_LIBRARY_PATH and the OS would fail to find it,
because it is the OS loader, not the JVM, that loads it.
> For uncached usage of native libraries, a special directory in the JAR could be used
for native libraries. But I'd argue that the DistributedCache enhancement would be enough,
and if somebody wants to use a native library s/he should use the DistributedCache. Or a
JobConf addLibrary method that uses the DistributedCache under the hood at submission time.
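
To make the java.library.path versus LD_LIBRARY_PATH point in the description concrete:
System.loadLibrary searches java.library.path for the library named by the Java code, while
the libraries that one links against are resolved by the OS dynamic loader, which consults
LD_LIBRARY_PATH on Linux. A minimal sketch of a launcher that sets both for a child JVM
follows; TaskJvmLauncher and its build method are hypothetical names, not Hadoop's, and
"java" is assumed to be on the PATH.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical launcher, not part of Hadoop.
final class TaskJvmLauncher {

    /** Builds (but does not start) a child JVM that can load libs from libDir. */
    static ProcessBuilder build(File libDir, String mainClass) {
        String dir = libDir.getAbsolutePath();

        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        // Lets System.loadLibrary("foo") find libfoo.so in libDir.
        cmd.add("-Djava.library.path=" + dir);
        cmd.add(mainClass);

        ProcessBuilder pb = new ProcessBuilder(cmd);
        // Lets the OS loader find whatever libfoo.so itself depends on.
        String existing = pb.environment().get("LD_LIBRARY_PATH");
        String value = (existing == null || existing.isEmpty())
                ? dir
                : dir + File.pathSeparator + existing;
        pb.environment().put("LD_LIBRARY_PATH", value);
        return pb;
    }
}
```

Setting only the JVM flag covers the first load; the environment variable is what covers
the transitive dependencies described above.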

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

