hadoop-common-user mailing list archives

From jason hadoop <jason.had...@gmail.com>
Subject Re: Support for zipped input files
Date Tue, 10 Mar 2009 04:11:32 GMT
Hadoop has support for S3; compression support is handled at another
level and should also work.
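
The thread does not confirm whether Hadoop's compression layer covers the .zip container specifically (as opposed to gzip), so here is a minimal sketch of the fallback: reading zip entries by hand with the JDK's java.util.zip.ZipInputStream. It assumes the archive is available as a plain InputStream (for example, one opened from a Hadoop file system, which is not shown here) and that each entry is UTF-8 text. The class name ZipLineReader and the method readLines are hypothetical names chosen for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of Hadoop: walks a zip archive sequentially
// and returns every text line, prefixed with the name of its entry.
final class ZipLineReader {

    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            // getNextEntry() positions the stream at the start of the next entry.
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                // Assumption: entries are UTF-8 text. The reader is deliberately
                // not closed, since closing it would close the whole archive.
                BufferedReader reader = new BufferedReader(
                        new InputStreamReader(zip, StandardCharsets.UTF_8));
                String line;
                // Reads stop at the end of the current entry, not of the archive.
                while ((line = reader.readLine()) != null) {
                    lines.add(entry.getName() + "\t" + line);
                }
            }
        }
        return lines;
    }
}
```

Because a zip archive is read sequentially from a single stream, a sketch like this treats each archive as one unit of work rather than something to split across tasks.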

On Mon, Mar 9, 2009 at 9:05 PM, Ken Weiner <ken@gumgum.com> wrote:

> I have a lot of large zipped (not gzipped) files sitting in an Amazon S3
> bucket that I want to process.  What is the easiest way to process them
> with a Hadoop map-reduce job?  Do I need to write code to transfer them
> out of S3, unzip them, and then move them to HDFS before running my job,
> or does Hadoop have support for processing zipped input files directly
> from S3?

Alpha Chapters of my book on Hadoop are available