hadoop-common-user mailing list archives

From Erik Forsberg <forsb...@opera.com>
Subject Copy files https -> HDFS
Date Tue, 07 Jul 2009 11:02:02 GMT
Hi!

I have a list of files that reside on an HTTPS server (which requires
authentication, either username/password or a client certificate), and
I want to copy them into HDFS for later Map/Reduce processing. The files
are rather large, so I'd like to copy them in parallel.

I would guess this has been done before. Is there example code
anywhere? I can imagine creating a mapper-only job with a list of files
as input, but how do I easily write to HDFS from a mapper?

Thanks,
\EF
