hadoop-general mailing list archives

From Gautam Singaraju <gautam.singar...@gmail.com>
Subject Re: DBOutputFormat over SSH?
Date Fri, 23 Apr 2010 05:02:05 GMT
In that case you would need to design a local differential DB and
synchronize it with the database at the remote location. That
synchronization could run over SSH and should be atomic.
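
A minimal sketch of that idea, under stated assumptions: the local
differential DB is a SQLite table holding the last snapshot that was
synced, and a second in-memory SQLite database stands in for the remote
MySQL instance. The table names (synced, results), the key/value schema,
and the helper names compute_delta and apply_delta are all hypothetical.
The SSH transport itself is not shown; only the delta computation and
the all-or-nothing apply are.

```python
import sqlite3

def compute_delta(local, new_results):
    """Return (upserts, deletes) relative to the last synced snapshot."""
    last = dict(local.execute("SELECT key, value FROM synced"))
    upserts = {k: v for k, v in new_results.items() if last.get(k) != v}
    deletes = [k for k in last if k not in new_results]
    return upserts, deletes

def apply_delta(conn, table, upserts, deletes):
    """Apply the whole delta in one transaction: all of it lands or none."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.executemany(
            f"INSERT OR REPLACE INTO {table}(key, value) VALUES (?, ?)",
            list(upserts.items()))
        conn.executemany(
            f"DELETE FROM {table} WHERE key = ?",
            [(k,) for k in deletes])

# Hypothetical setup: both databases are in-memory stand-ins.
local = sqlite3.connect(":memory:")
remote = sqlite3.connect(":memory:")  # assumption: stands in for remote MySQL
local.execute("CREATE TABLE synced (key TEXT PRIMARY KEY, value INTEGER)")
remote.execute("CREATE TABLE results (key TEXT PRIMARY KEY, value INTEGER)")

def sync(new_results):
    upserts, deletes = compute_delta(local, new_results)
    # Remote first: if it fails, the local snapshot is left unchanged
    # and the same delta is recomputed on the next attempt.
    apply_delta(remote, "results", upserts, deletes)
    apply_delta(local, "synced", upserts, deletes)
    return upserts, deletes

first = sync({"a": 1, "b": 2})
second = sync({"a": 1, "b": 5, "c": 7})  # only b and c should be shipped
```

Only the rows that changed since the previous run cross the link, which
matters if the encrypted channel is slow.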

On Fri, Apr 23, 2010 at 12:29 AM, Eric Sammer <esammer@cloudera.com> wrote:
> In general, you'll want to avoid tunneling permanent production code
> over ssh tunnels. They're flaky and do not recover from network
> interruption in any reasonable way. If you need to do this, a VPN is
> the correct approach. Linux can easily do IPsec point-to-point tunnels
> that are
> reasonably secure. If you really only have port 22 then I suppose
> that's your only option but I really would reevaluate the security
> policy.
> Either way, it's going to be slow due to the encryption overhead but
> if it's a small amount of data, that may be fine.
> On Fri, Apr 23, 2010 at 12:18 AM, Gautam Singaraju
> <gautam.singaraju@gmail.com> wrote:
>> All,
>> I have a use case where I need to crunch a large amount of data and
>> push the results (a comparatively smaller set) to a MySQL DB at a
>> remote location. For security reasons, only SSH ports are open. I
>> tried using Java Secure Channel [1] in combination with some custom
>> JDBC code from the reducers.
>> Can anyone comment on the performance of DBOutputFormat? Have there
>> been any efforts to tunnel this through SSH? This is going to be an
>> expensive operation; any suggestions would be welcome.
>> [1] http://www.jcraft.com/jsch/
>> ---
>> Gautam Singaraju
> --
> Eric Sammer
> phone: +1-917-287-2675
> twitter: esammer
> data: www.cloudera.com
