hadoop-common-user mailing list archives

From Bobby Dennett <bdenn...@gmail.com>
Subject Re: Enabling LZO compression of map outputs in Cloudera Hadoop 0.20.1
Date Thu, 05 Aug 2010 23:52:39 GMT
Hi Josh,

No real pain points... just trying to research the "best"
way to build the libraries and jar files needed to support LZO
compression in Hadoop. In particular, there are two repositories
to build from, and I am trying to find out whether one should be used
over the other. For instance, in your previous posting you refer to
hadoop-gpl-compression, while the Twitter blog post from last year
mentions the Hadoop-LZO project. At first glance, Hadoop-LZO seems
preferable, but we're curious whether there are any caveats/gotchas we
should be aware of.


On Thu, Aug 5, 2010 at 15:59, Josh Patterson <josh@cloudera.com> wrote:
> Bobby,
> We're working hard to make compression easier; the biggest hurdle
> currently is the licensing of the LZO codec libs (GPL,
> which is not compatible with the ASF's BSD-style license).
> Outside of making the changes to the mapred-site.xml file, what do
> you view as the biggest pain point with your setup?
> Josh Patterson
> Cloudera
> On Thu, Aug 5, 2010 at 6:52 PM, Bobby Dennett
> <bdennett+software@gmail.com> wrote:
>> We are looking to enable LZO compression of the map outputs on our
>> Cloudera 0.20.1 cluster. It seems there are various sets of
>> instructions available and I am curious what your thoughts are
>> regarding which one would be best for our Hadoop distribution and OS
>> (Ubuntu 8.04 64-bit). In particular, hadoop-gpl-compression
>> (http://code.google.com/p/hadoop-gpl-compression) vs. hadoop-lzo
>> (http://github.com/kevinweil/hadoop-lzo).
>> Some of what appear to be the better instructions/guides out there:
>> * Josh Patterson's reply on June 25th to the "Newbie to HDFS
>> compression" thread --
>> http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201006.mbox/%3CAANLkTileo-q8USEiP8Y3Na9pDYHlyUFIPpR0In0LkpJm@mail.gmail.com%3E
>> * hadoop-gpl-compression FAQ --
>> http://code.google.com/p/hadoop-gpl-compression/wiki/FAQ
>> * "Hadoop at Twitter (part 1): Splittable LZO Compression" blog post
>> -- http://www.cloudera.com/blog/2009/11/hadoop-at-twitter-part-1-splittable-lzo-compression/
>> Thanks in advance,
>> -Bobby
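
For reference, the mapred-site.xml changes mentioned in the thread usually come down to two properties on a 0.20-era cluster. The fragment below is a minimal sketch and is not taken from any of the messages: it assumes the LZO native libraries and the codec jar are already installed on every node, and that the codec class is com.hadoop.compression.lzo.LzoCodec, which is the class name I believe both hadoop-gpl-compression and hadoop-lzo provide. Check the property names against the mapred-default.xml shipped with your distribution before relying on them.

```xml
<!-- mapred-site.xml (sketch; assumes the LZO jar and native libs are on every node) -->
<configuration>
  <!-- Compress intermediate map outputs before they are shuffled to reducers -->
  <property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
  </property>
  <!-- Codec used for the map outputs; class name is an assumption, see above -->
  <property>
    <name>mapred.map.output.compression.codec</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>
</configuration>
```

The codec typically also has to be registered under io.compression.codecs in core-site.xml. Which of the two repositories the jar is built from should not change these settings if the class name assumption holds.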
