hadoop-general mailing list archives

From Gautam Singaraju <gautam.singar...@gmail.com>
Subject Re: DBOutputFormat over SSH?
Date Fri, 23 Apr 2010 04:58:41 GMT
Thanks Sonal, I will check it out.
---
Gautam



On Fri, Apr 23, 2010 at 12:27 AM, Sonal Goyal <sonalgoyal4@gmail.com> wrote:
> Hi Gautam,
>
> DBOutputFormat inserts records one by one, which is inherently slow. You can
> use the open source, Apache-licensed HIHO framework, which provides MySQL's
> "load data infile" functionality. It may be better suited to your needs.
>
> HIHO is available at http://code.google.com/p/hiho/
>
> I haven't tested it over SSH; please let me know if you need any help setting
> it up.
>
> Thanks and Regards,
> Sonal
> www.meghsoft.com
>
>
> On Fri, Apr 23, 2010 at 9:48 AM, Gautam Singaraju <
> gautam.singaraju@gmail.com> wrote:
>
>> All,
>>
>> I have a use case where I need to crunch a large amount of data and
>> push the results (a comparatively smaller set) to a MySQL db at a
>> remote location. For security reasons, only SSH ports are open. I
>> tried using Java Secure Channel [1] in combination with some custom
>> JDBC code from the reducers.
>>
>> Can anyone comment on the performance of DBOutputFormat? Have there
>> been any efforts to tunnel this through SSH? This is going to be an
>> expensive operation; any suggestions would be welcome.
>>
>> [1] http://www.jcraft.com/jsch/
>> ---
>> Gautam Singaraju
>>
>
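
For readers of this thread, the row-by-row cost discussed above can be reduced with plain JDBC batching. The following is a minimal sketch, not code from HIHO or DBOutputFormat. It assumes an SSH local port forward is already in place (for example `ssh -L 3307:127.0.0.1:3306 user@remote-host`, where the host, user, and port are hypothetical), so the reducer would connect to a URL such as jdbc:mysql://127.0.0.1:3307/mydb. The class name, table name, and column names are also made up for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class BatchedInsertSketch {

    // Hypothetical table and columns, used only for illustration.
    static final String INSERT_SQL =
        "INSERT INTO results (result_key, result_value) VALUES (?, ?)";

    // Number of executeBatch() calls needed to send rowCount rows.
    static int batchCount(int rowCount, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        return (rowCount + batchSize - 1) / batchSize;
    }

    // Sends rows in groups of batchSize inside one transaction.
    // Each row is a two-element array: {key, value}.
    // Returns the number of batches that were executed.
    static int writeInBatches(Connection conn, List<String[]> rows, int batchSize)
            throws SQLException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // commit once at the end, not per row
        int batches = 0;
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            int pending = 0;
            for (String[] row : rows) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch();
                    batches++;
                    pending = 0;
                }
            }
            if (pending > 0) { // flush the final partial batch
                ps.executeBatch();
                batches++;
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
        return batches;
    }
}
```

With 10 rows and a batch size of 4, this sketch issues three executeBatch() calls instead of ten separate inserts. How much that helps over an SSH tunnel depends on the driver and on link latency, so it is worth measuring before relying on it.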
