hadoop-general mailing list archives

From Eric Sammer <esam...@cloudera.com>
Subject Re: DBOutputFormat over SSH?
Date Fri, 23 Apr 2010 04:29:00 GMT
In general, you'll want to avoid running permanent production code
over ssh tunnels. They're flaky and do not recover from network
interruptions in any reasonable way. If you need to do this, a VPN is
the correct approach. Linux can easily do IPsec point-to-point tunnels that are
reasonably secure. If you really only have port 22 then I suppose
that's your only option but I really would reevaluate the security

Either way, it's going to be slow due to the encryption overhead but
if it's a small amount of data, that may be fine.
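
If the writes do end up going over a link that can drop, one common mitigation is to wrap each write in a bounded retry with backoff. What follows is only a minimal sketch under stated assumptions, not something taken from DBOutputFormat or JSch: the class name TunnelRetry and the method withRetry are hypothetical, the operation passed in is assumed to open its own connection on each attempt, and the tunnel itself is assumed to be restarted by something outside this code. Retrying is only safe if the write is idempotent (for example, an upsert keyed on the reducer's output key).

```java
import java.sql.SQLException;
import java.util.concurrent.Callable;

// Hypothetical helper, for illustration only.
final class TunnelRetry {

    // Runs op, retrying on SQLException up to maxAttempts times.
    // The delay between attempts starts at initialDelayMs and doubles each time.
    // Any other exception type is propagated immediately.
    static <T> T withRetry(Callable<T> op, int maxAttempts, long initialDelayMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (SQLException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay); // back off before the next attempt
                    delay *= 2;
                }
            }
        }
        throw last; // every attempt failed; surface the last error
    }

    public static void main(String[] args) throws Exception {
        // Simulated write that fails twice, as if the tunnel had dropped.
        int[] calls = {0};
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new SQLException("simulated tunnel drop");
            }
            return "written";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In a reducer, the operation passed to withRetry would be the JDBC batch write; the retry only hides short interruptions and does nothing about the encryption overhead.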

On Fri, Apr 23, 2010 at 12:18 AM, Gautam Singaraju
<gautam.singaraju@gmail.com> wrote:
> All,
> I have a use-case where I need to crunch a large amount of data and
> push the results (a comparatively smaller set) to a MySQL db at a
> remote location. As per security concerns, only SSH ports are open. I
> tried using Java Secure Channel [1] in combination with some custom
> JDBC code from the reducers.
> Can anyone comment on the performance of DBOutputFormat? Have there
> been any efforts to tunnel this through SSH? This is going to be an
> expensive operation; any suggestions would be welcome.
> [1] http://www.jcraft.com/jsch/
> ---
> Gautam Singaraju

Eric Sammer
phone: +1-917-287-2675
twitter: esammer
data: www.cloudera.com
