hadoop-common-user mailing list archives

From Ramya Sunil <ra...@hortonworks.com>
Subject Re: streaming cacheArchive shared libraries
Date Fri, 05 Aug 2011 17:44:37 GMT
Hi Keith,

I tried the exact use case you described and it works fine for me.
Here are the commands I ran:

[ramya]$ jar vxf samplelib.jar
 created: META-INF/
 inflated: libhdfs.so

[ramya]$ hadoop dfs -put samplelib.jar samplelib.jar

[ramya]$ hadoop jar hadoop-streaming.jar -input InputDir \
    -mapper "ls testlink/libhdfs.so" -reducer NONE -output out -cacheArchive

[ramya]$ hadoop dfs -cat out/*
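
Note that the mapper above finds the library under testlink/, not directly
in the task's cwd. Assuming the archive is cached under a link name of
testlink (as the mapper command suggests), the unpacked .so sits one
directory down, so the dynamic linker may need to be pointed at it. A
minimal sketch of a wrapper script follows; the function name
lib_path_with_archive and the LINK_NAME variable are hypothetical, and it
assumes a Linux linker that honours LD_LIBRARY_PATH:

```shell
#!/bin/sh
# Hypothetical wrapper for a streaming mapper (a sketch, not a tested recipe).
# LINK_NAME is assumed to match the link name of the cached archive
# ("testlink" in the example above).
LINK_NAME="${LINK_NAME:-testlink}"

# Print a library search path with the unpacked-archive directory first,
# followed by the existing search path if there is one.
lib_path_with_archive() {
    archive_dir="$1"
    current="$2"
    if [ -n "$current" ]; then
        printf '%s:%s\n' "$archive_dir" "$current"
    else
        printf '%s\n' "$archive_dir"
    fi
}

LD_LIBRARY_PATH="$(lib_path_with_archive "$PWD/$LINK_NAME" "${LD_LIBRARY_PATH:-}")"
export LD_LIBRARY_PATH

# Run the real mapper, if one was given as arguments to this wrapper.
if [ "$#" -gt 0 ]; then
    exec "$@"
fi
```

The wrapper would be shipped with the job and named as the mapper, with the
real executable passed as its argument.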

Hope it helps.


On 8/5/11 10:10 AM, "Keith Wiley" <kwiley@keithwiley.com> wrote:

I can use -cacheFile to load .so files into the distributed cache and it
works fine (the streaming executable links against the .so and runs), but I
can't get it to work with -cacheArchive. It always says it can't find the
.so file. I realize that if you jar a directory, the directory will be
recreated when you unjar, but I've tried jarring a file directly. It is
easily verified that unjarring such a file reproduces the original file as a
sibling of the jar file itself. So it seems to me that -cacheArchive should
have transferred the jar file to the cwd of my task, unjarred it, and
produced a .so file right there, but it doesn't link up with the executable.
Like I said, I know this basic approach works just fine with -cacheFile.

What could be the problem here?  I can't easily see the files on the cluster
since it is a remote cluster with limited access.  I don't believe I can ssh
to any individual machine to investigate the files that are created for a
task...but I think I have worked through the process logically and I'm not
sure what I'm doing wrong.


Keith Wiley     *kwiley@keithwiley.com*     keithwiley.com

"Luminous beings are we, not this crude matter."
                                           --  Yoda
