hadoop-general mailing list archives

From Sonal Goyal <sonalgoy...@gmail.com>
Subject Re: DBOutputFormat over SSH?
Date Fri, 23 Apr 2010 04:27:44 GMT
Hi Gautam,

DBOutputFormat inserts records one by one, which is inherently slow. You can
use the open source, Apache-licensed HIHO framework, which provides MySQL's
"LOAD DATA INFILE" functionality. It may be better suited to your needs.
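
To illustrate the bulk-load idea, here is a rough sketch of my own. It is not
HIHO's code: the class and method names are made up for illustration, and it
assumes MySQL's default LOAD DATA field format (tab-separated fields,
backslash escapes, \N for NULL) and a server and JDBC driver that allow
LOCAL INFILE. The reducers would write escaped lines to a file, and a single
statement would then load the whole file instead of issuing one INSERT per
record.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper, for illustration only; not part of Hadoop or HIHO.
final class BulkLoadSketch {

    // Escape one field for MySQL's default LOAD DATA format:
    // NULL becomes \N, and backslash, tab and newline are backslash-escaped.
    static String escapeField(String value) {
        if (value == null) {
            return "\\N";
        }
        StringBuilder sb = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '\t': sb.append("\\t"); break;
                case '\n': sb.append("\\n"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Join the escaped fields of one record with tabs (no trailing newline).
    static String toTsvLine(String... fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append('\t');
            }
            sb.append(escapeField(fields[i]));
        }
        return sb.toString();
    }

    // Build the bulk-load statement. The table name is assumed to be a
    // trusted identifier; only the file path is escaped here.
    static String loadStatement(String localPath, String table) {
        String quoted = localPath.replace("\\", "\\\\").replace("'", "\\'");
        return "LOAD DATA LOCAL INFILE '" + quoted + "' INTO TABLE " + table;
    }

    // Run the load over an already open JDBC connection (assumed to be MySQL).
    static int bulkLoad(Connection conn, String localPath, String table)
            throws SQLException {
        try (Statement st = conn.createStatement()) {
            return st.executeUpdate(loadStatement(localPath, table));
        }
    }
}
```

For example, toTsvLine("1", "a\tb") produces the two fields separated by a
real tab, with the embedded tab written as a backslash followed by "t", so the
record stays on one line with the right number of columns.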

HIHO is available at http://code.google.com/p/hiho/

I haven't tested it over SSH; please let me know if you need any help setting
it up.

Thanks and Regards,

On Fri, Apr 23, 2010 at 9:48 AM, Gautam Singaraju <
gautam.singaraju@gmail.com> wrote:

> All,
> I have a use case where I need to crunch a large amount of data and
> push the results (a comparatively smaller set) to a MySQL db at a
> remote location. Due to security concerns, only SSH ports are open. I
> tried using Java Secure Channel [1] in combination with some custom
> JDBC code from the reducers.
> Can anyone comment on the performance of DBOutputFormat? Have there
> been any efforts to tunnel this through SSH? This is going to be an
> expensive operation; any suggestions would be welcome.
> [1] http://www.jcraft.com/jsch/
> ---
> Gautam Singaraju
