hadoop-common-user mailing list archives

From Eric Sammer <esam...@cloudera.com>
Subject Re: Does Hadoop compress files?
Date Sun, 04 Apr 2010 08:46:22 GMT
To clarify, there is no implicit compression in HDFS. In other words,
if you want your data to be compressed, you have to write it that way.
If you plan on writing MapReduce jobs to process the compressed data,
you'll want to use a splittable compression format. This generally
means LZO or block-compressed SequenceFiles which others have
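
To illustrate why splittability matters, here is a minimal Python sketch
that contrasts one monolithic compressed stream with independently
compressed blocks. It is a toy stand-in, not Hadoop's SequenceFile format
or LZO: it uses zlib from the standard library, and the function names,
the sample log records, and the block size of 1000 records are all
assumptions made for the example.

```python
import zlib


def compress_whole(records):
    """Compress every record as a single zlib stream.

    A reader must decompress from the start to reach any record,
    so the result cannot be divided among parallel tasks.
    """
    return zlib.compress(b"".join(records))


def compress_blocks(records, records_per_block=1000):
    """Compress records in independent blocks (hypothetical block size).

    Each block is a complete zlib stream, so any block can be
    decompressed without reading the ones before it.
    """
    blocks = []
    for start in range(0, len(records), records_per_block):
        chunk = b"".join(records[start:start + records_per_block])
        blocks.append(zlib.compress(chunk))
    return blocks


def read_block(blocks, index):
    """Decompress one block only and return its newline-terminated records."""
    return zlib.decompress(blocks[index]).splitlines(keepends=True)


# Made-up log lines standing in for real log data.
records = [f"2010-04-03 host{i % 7} event={i}\n".encode() for i in range(5000)]

whole = compress_whole(records)
blocks = compress_blocks(records)

# A worker assigned the third block reads just that block.
third = read_block(blocks, 2)
```

In the block layout, each worker can be handed its own block, which is
roughly the property a MapReduce job relies on when it splits a large
file across tasks. The single-stream layout forces one reader to process
the whole file from the beginning.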

On Sat, Apr 3, 2010 at 10:45 AM, u235sentinel <u235sentinel@gmail.com> wrote:
> I'm starting to evaluate Hadoop.  We are currently running Sensage and store
> a lot of log files in our current environment.  I've been looking at the
> Hadoop forums and googling (of course) but haven't learned if Hadoop HDFS
> does any compression to files we store.
> On the average we're storing about 600 gigs a week in log files (more or
> less).  Generally we need to store about 1 1/2 - 2 years of logs.  With
> Sensage compression we can store about 200+ Tb of logs in our current
> environment.
> As I said, we're starting to evaluate if Hadoop would be a good replacement
> to our Sensage environment (or at least augment it).
> Thanks a bunch!!

Eric Sammer
phone: +1-917-287-2675
twitter: esammer
data: www.cloudera.com
