hadoop-hdfs-user mailing list archives

From Mohammad Tariq <donta...@gmail.com>
Subject Re: Calling C inside MR
Date Mon, 03 Dec 2012 13:03:16 GMT
Thank you so much Bertrand for the quick response.

One quick question: would it affect MR performance? Suppose I write one
MR job with the entire processing logic in Java, and a second MR job for
the same task that calls the corresponding C module instead. Will there
be much difference between the two jobs, in performance or otherwise?

Thanks again.

    Mohammad Tariq

On Mon, Dec 3, 2012 at 6:24 PM, Bertrand Dechoux <dechouxb@gmail.com> wrote:

> You provided the answer: JNI is a solution. Another would be to use
> Hadoop Streaming, if your program can read from stdin and write to stdout
> in a suitable format.
> An MR job is, in the end, plain Java, and that does not change how Java can
> call an external process.
> Bertrand
> On Mon, Dec 3, 2012 at 1:05 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>> Hello list,
>>           I have a tool (written in C) that performs several different types
>> of operations and can be used as a command-line utility. I had to write a
>> similar tool, as we have moved to the Hadoop platform for most things.
>> So far I have taken this tool as a reference and written MR jobs
>> corresponding to some of the modules of this tool, and they are working fine.
>> But this is taking a lot of time. So I just wanted to ask whether it is
>> possible to call this tool from an MR job, somewhat like through JNI.
>> (I hope it is; otherwise I have to write the rest from scratch,
>> and we are running out of time.)
>> Many thanks.
>> Regards,
>>     Mohammad Tariq
> --
> Bertrand Dechoux
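
For illustration, here is a minimal sketch of the "plain Java calling an
external process" idea from the reply above. It is not code from this thread.
The class name ExternalToolCaller and the method runTool are hypothetical, and
"cat" merely stands in for the C tool, on the assumption that the tool reads
records from stdin and writes results to stdout. In a real job, a method like
this would presumably be called from the mapper or reducer.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ExternalToolCaller {

    // Hypothetical helper: starts the command once, writes every record to its
    // stdin (one per line), and returns the lines it prints on stdout.
    static List<String> runTool(List<String> command, List<String> records)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Let the tool's error output go to this JVM's stderr.
        builder.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process process = builder.start();

        // Write input on a separate thread so that reading the output below
        // cannot deadlock when the pipe buffers fill up.
        Thread writer = new Thread(() -> {
            try (OutputStream stdin = process.getOutputStream()) {
                for (String record : records) {
                    stdin.write((record + "\n").getBytes(StandardCharsets.UTF_8));
                }
            } catch (IOException e) {
                // The tool may exit before consuming all input; a failure is
                // reported through the exit code check below.
            }
        });
        writer.start();

        List<String> output = new ArrayList<>();
        try (BufferedReader stdout = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = stdout.readLine()) != null) {
                output.add(line);
            }
        }

        writer.join();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("external tool exited with code " + exitCode);
        }
        return output;
    }

    public static void main(String[] args) throws Exception {
        // "cat" is a placeholder for the real C tool and its arguments.
        List<String> result = runTool(Arrays.asList("cat"), Arrays.asList("a\t1", "b\t2"));
        for (String line : result) {
            System.out.println(line);
        }
    }
}
```

The sketch starts the process once for a whole batch of records rather than
once per record, since launching a process for every record would add startup
cost to each call.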
