hadoop-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: executing files on hdfs via hadoop not possible? is JNI/JNA a reasonable solution?
Date Sun, 17 Mar 2013 09:50:34 GMT
You're confusing two things here. HDFS is a data storage filesystem.
MapReduce (MR), generally speaking, does not have anything to do with HDFS.

A reducer runs as a regular JVM process on the node it is assigned to,
and can execute any program you'd like it to by downloading it onto its
configured local filesystem and executing it.
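
As a minimal sketch of that last step, written in Python for brevity as it
might appear inside a streaming task: the helper name `run_local_binary` and
its parameters are hypothetical, and the binary is assumed to have already
been copied to the task's local disk (for example via the distributed cache).
The helper sets the owner execute bit on the local copy and then runs it.

```python
import os
import stat
import subprocess


def run_local_binary(path, args=(), stdin_data=b""):
    """Mark a locally stored file as executable and run it.

    `path` is assumed to already be on the task's local filesystem;
    this helper does not talk to HDFS at all.
    """
    mode = os.stat(path).st_mode
    # Add the owner execute bit in case the local copy lacks it.
    os.chmod(path, mode | stat.S_IXUSR)
    result = subprocess.run(
        [os.path.abspath(path), *args],
        input=stdin_data,          # bytes fed to the program's stdin
        stdout=subprocess.PIPE,    # capture the program's stdout
        check=True,                # raise if the program exits non-zero
    )
    return result.stdout
```

The same idea in a Java reducer would use the JDK's `File.setExecutable` and
`ProcessBuilder` on the localized file.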

If your goal is merely to run a regular program over data that is
sitting in HDFS, that can be achieved. If your library is in C then
simply use a streaming program to run it and use libhdfs' HDFS API
(C/C++) to read data into your functions from HDFS files. Would this
not suffice?
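
For illustration, a streaming task is just a program that reads records
line by line from stdin and writes tab-separated key/value lines to stdout.
The sketch below shows only that loop; `stream_records` and `convert` are
hypothetical names, and the actual call into the native library (or libhdfs)
is deliberately left out.

```python
def stream_records(lines, convert, out):
    """Apply `convert` to each input record and emit key<TAB>value lines.

    `lines` is any iterable of text lines (sys.stdin in a real task),
    `convert` is a placeholder for the real per-record work,
    `out` is a writable text stream (sys.stdout in a real task).
    Returns the number of records processed.
    """
    count = 0
    for line in lines:
        record = line.rstrip("\n")
        if not record:
            continue  # skip blank lines
        out.write(f"{record}\t{convert(record)}\n")
        count += 1
    return count
```

In a real task this would be driven as
`stream_records(sys.stdin, my_convert, sys.stdout)`, where `my_convert` is
whatever wraps the conversion code.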

On Sun, Mar 17, 2013 at 3:09 PM, Julian Bui <julianbui@gmail.com> wrote:
> Hi hadoop users,
> I just want to verify that there is no way to put a binary on HDFS and
> execute it using the hadoop java api.  If not, I would appreciate advice on
> creating an implementation that uses native libraries.
> "In contrast to the POSIX model, there are no sticky, setuid or setgid bits
> for files as there is no notion of executable files."  Is there no
> workaround?
> A little bit more about what I'm trying to do.  I have a binary that
> converts my image to another image format.  I currently want to put it in
> the distributed cache and tell the reducer to execute the binary on the data
> on hdfs.  However, since I can't set the execute permission bit on that
> file, it seems that I cannot do that.
> Since I cannot use the binary, it seems like I have to use my own
> implementation to do this.  The challenge is that these libraries that I can
> use to do this are .a and .so files.  Would I have to use JNI and package
> the libraries in the distributed cache and then have the reducer find and
> use those libraries on the task nodes?  Actually, I wouldn't want to use
> JNI, I'd probably want to use java native access (JNA) to do this.  Has
> anyone used JNA with hadoop and been successful?  Are there problems I'll
> encounter?
> Please let me know.
> Thanks,
> -Julian

Harsh J
