hadoop-common-dev mailing list archives

From "Zhixuan Zhu" <z...@calpont.com>
Subject RE: how to pass a hdfs file to a c++ process
Date Tue, 23 Aug 2011 14:59:28 GMT
I'll actually invoke one executable from each of my maps. Because this
C++ program has already been implemented and used in the past, I just want
to integrate it into our Hadoop map/reduce job without having to
re-implement the process in Java. So my map is going to be very simple:
it just calls the process and passes it the input files.
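A map like that can shell out with `ProcessBuilder`, which is a bit more robust than `Runtime.exec(String)` because it avoids shell-style argument splitting. Below is a minimal standalone sketch of the invocation pattern, not the actual mapper; `myprocess` and the file path from the thread are placeholders, so the demo runs `echo` instead to stay runnable anywhere:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class ExternalProcessSketch {

    // Runs a command and returns its stdout. The command and its
    // arguments are supplied by the caller; nothing here is
    // Hadoop-specific.
    static String runCommand(String... cmd)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.redirectErrorStream(true);          // merge stderr into stdout
        Process p = pb.start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        int rc = p.waitFor();                  // always reap the child
        if (rc != 0) {
            throw new IOException("process exited with " + rc);
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for: runCommand("myprocess", "-file", localPath)
        System.out.print(runCommand("echo", "hello"));
    }
}
```

One caveat for the real mapper: the external binary cannot read `hdfs://` paths directly, so the usual approach is to copy the input to the task's local working directory first (e.g. via `FileSystem.copyToLocalFile` or the DistributedCache) and hand the local path to the process.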


-----Original Message-----
From: Arun C Murthy [mailto:acm@hortonworks.com] 
Sent: Tuesday, August 23, 2011 9:51 AM
To: common-dev@hadoop.apache.org
Subject: Re: how to pass a hdfs file to a c++ process

On Aug 22, 2011, at 12:57 PM, Zhixuan Zhu wrote:

> Hi All,
> I'm using hadoop-0.20.2 to try out some simple tasks. I asked a
> question about FileInputFormat a few days ago and got some prompt
> replies from this forum, and they helped a lot. Thanks again! Now I
> have another question. I'm trying to invoke a C++ process from my
> mapper for each hdfs file in the input directory to achieve some
> parallel processing.

That seems weird - why aren't you using more maps and one file per map?

> But how do I pass the file to the program? I would want to do
> something like the following in my mapper:

In any case, libhdfs is one way to do HDFS operations from C/C++.
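For reference, reading an HDFS file through the libhdfs C API looks roughly like this. This is an untested sketch: it needs the `hdfs.h` header and native library shipped with Hadoop, plus a reachable namenode, and the path is a placeholder:

```c
#include <stdio.h>
#include <fcntl.h>   /* O_RDONLY */
#include "hdfs.h"    /* libhdfs header shipped with Hadoop */

int main(void) {
    /* "default" picks up the namenode from the Hadoop config;
       an explicit host and port could be given instead. */
    hdfsFS fs = hdfsConnect("default", 0);
    if (!fs) {
        fprintf(stderr, "hdfsConnect failed\n");
        return 1;
    }

    /* The path below is illustrative. */
    hdfsFile in = hdfsOpenFile(fs, "/user/grace/input.txt",
                               O_RDONLY, 0, 0, 0);
    if (!in) {
        fprintf(stderr, "hdfsOpenFile failed\n");
        hdfsDisconnect(fs);
        return 1;
    }

    char buf[4096];
    tSize n;
    while ((n = hdfsRead(fs, in, buf, sizeof(buf))) > 0) {
        fwrite(buf, 1, (size_t)n, stdout);
    }

    hdfsCloseFile(fs, in);
    hdfsDisconnect(fs);
    return 0;
}
```

This would let the C++ program read from HDFS itself instead of having the mapper stage files locally first.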


> Process lChldProc = Runtime.getRuntime().exec("myprocess -file
> $filepath");
> How do I pass the hdfs filesystem to an outside process like that? Is
> HadoopStreaming the direction I should go?
> Thanks very much for any reply in advance.
> Best,
> Grace
