hadoop-common-dev mailing list archives

From "Zhixuan Zhu" <z...@calpont.com>
Subject RE: how to pass a hdfs file to a c++ process
Date Tue, 23 Aug 2011 15:51:20 GMT
Thank you very much!

'hadoop fs -cat <file> | mylegacyexe' is exactly the kind of method I
had come up with and was about to try out. I'm glad to hear that it's
actually an "official" alternative.

Thanks again. This is a great forum!
Grace
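
For the case where the mapper stays in Java rather than using streaming,
the same pipe idea can be sketched with the JDK's ProcessBuilder: read the
file as a stream and copy it to the child's stdin. This is only a sketch
under assumptions, not something from the thread: the class name
PipeToLegacy and the method pipe are hypothetical, `cat` stands in for the
legacy executable, and a ByteArrayInputStream stands in for the stream
that, inside a mapper, would come from Hadoop's FileSystem.open(Path). It
assumes Java 7 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: name and structure are illustrative only.
final class PipeToLegacy {

    /**
     * Copies everything from `in` to the stdin of `command` and returns the
     * command's exit code. In a real mapper, `in` would be the stream
     * returned by FileSystem.open(Path) (assumption: not shown here so the
     * example runs without Hadoop on the classpath).
     */
    static int pipe(InputStream in, List<String> command)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        // Let the child write straight to this JVM's stdout/stderr so it
        // cannot block on a pipe that nobody is reading.
        pb.redirectOutput(ProcessBuilder.Redirect.INHERIT);
        pb.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process child = pb.start();

        // Closing the child's stdin (done by try-with-resources) signals
        // end-of-input, just as the end of a shell pipe would.
        try (OutputStream stdin = child.getOutputStream()) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                stdin.write(buf, 0, n);
            }
        }
        return child.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in input; `cat` is a placeholder for the legacy executable.
        InputStream in = new ByteArrayInputStream(
                "line 1\nline 2\n".getBytes(StandardCharsets.UTF_8));
        int exitCode = pipe(in, Arrays.asList("cat"));
        System.out.println("exit code: " + exitCode);
    }
}
```

Like the 'hadoop fs -cat <file> | mylegacyexe' pipe, this streams the
bytes to the process instead of copying the file to local disk first, so
it only applies if the executable can read its input from stdin.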


-----Original Message-----
From: Arun Murthy [mailto:acm@hortonworks.com] 
Sent: Tuesday, August 23, 2011 10:36 AM
To: common-dev@hadoop.apache.org
Subject: Re: how to pass a hdfs file to a c++ process

That is a normal use case.

I'd encourage you to use Java MR (or even Pig/Hive).

If you really want to use your legacy app, use streaming with a map
command such as 'hadoop fs -cat <file> | mylegacyexe'

Arun

Sent from my iPhone

On Aug 23, 2011, at 8:00 AM, Zhixuan Zhu <zzhu@calpont.com> wrote:

> I'll actually invoke one executable from each of my maps. Because this
> C++ program has already been implemented and used in the past, I just
> want to integrate it into our Hadoop map/reduce without having to
> re-implement the process in Java. So my map is going to be very simple:
> it just calls the process and passes it the input files.
>
> Thanks,
> Grace
>
> -----Original Message-----
> From: Arun C Murthy [mailto:acm@hortonworks.com]
> Sent: Tuesday, August 23, 2011 9:51 AM
> To: common-dev@hadoop.apache.org
> Subject: Re: how to pass a hdfs file to a c++ process
>
>
> On Aug 22, 2011, at 12:57 PM, Zhixuan Zhu wrote:
>
>> Hi All,
>>
>> I'm using hadoop-0.20.2 to try out some simple tasks. I asked a
>> question about FileInputFormat a few days ago and got some prompt
>> replies from this forum, and they helped a lot. Thanks again! Now I
>> have another question. I'm trying to invoke a C++ process from my
>> mapper for each HDFS file in the input directory to achieve some
>> parallel processing.
>
> That seems weird - why aren't you using more maps and one file per
> map?
>
>> But how do I pass the file to the program? I would want to do
>> something like the following in my mapper:
>
> In any case, libhdfs is one way to do HDFS ops via C/C++.
>
> Arun
>
>>
>> Process lChldProc = Runtime.getRuntime().exec("myprocess -file $filepath");
>>
>> How do I pass the hdfs filesystem to an outside process like that? Is
>> HadoopStreaming the direction I should go?
>>
>> Thanks very much for any reply in advance.
>>
>> Best,
>> Grace
>
