hive-user mailing list archives

From Ankit Jain <ankitjainc...@gmail.com>
Subject Re: Lzo Compression
Date Wed, 27 Jul 2011 13:37:04 GMT
Hi all,
I tried to index an .lzo file with LzoIndexer, but got the following error:

java.lang.ClassCastException:
com.hadoop.compression.lzo.LzopCodec$LzopDecompressor cannot be cast to
com.hadoop.compression.lzo.LzopDecompressor

I performed the following steps:

1. $sudo apt-get install liblzo2-dev
2. Download the Hadoop-lzo clone from github repo (
https://github.com/kevinweil/hadoop-lzo )
3. Build the Hadoop-lzo project.
4. Copy the hadoop-lzo-*.jar file to the $HADOOP_HOME/lib dir of the cluster nodes
5. Copy the hadoop-lzo-install-dir/build/hadoop-lzo-*/native library to the
$HADOOP_HOME/lib dir of the cluster nodes.
6. core-site.xml (note: the codec list must be a single comma-separated value
with no embedded whitespace):
      <property>
        <name>io.compression.codecs</name>
        <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
      </property>
      <property>
        <name>io.compression.codec.lzo.class</name>
        <value>com.hadoop.compression.lzo.LzoCodec</value>
      </property>
7. mapred-site.xml:
      <property>
        <name>mapred.child.env</name>
        <value>JAVA_LIBRARY_PATH=/opt/ladap/common/hadoop-0.20.2/lib/native/Linux-i386-32/*</value>
      </property>
      <property>
        <name>mapred.map.output.compression.codec</name>
        <value>com.hadoop.compression.lzo.LzoCodec</value>
      </property>
8. hadoop-env.sh:
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/home/ankit/hadoop-0.20.1/lib/hadoop-lzo-0.4.12.jar
    export JAVA_LIBRARY_PATH=/home/ankit/hadoop-0.20.1/lib/native/Linux-i386-32/

9. Restarted the cluster.

10. Uploaded the .lzo file into HDFS.

11. Ran the following command for indexing:
bin/hadoop jar path/to/hadoop-lzo-*.jar
com.hadoop.compression.lzo.LzoIndexer lzofile.lzo
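A likely cause of this particular ClassCastException is having two incompatible LZO builds on the classpath at once: the old hadoop-gpl-compression jar (where LzopDecompressor is an inner class of LzopCodec) alongside the newer hadoop-lzo jar (where it is a top-level class). That is a hedged guess rather than a confirmed diagnosis, but it is worth checking. The sketch below just shows how to spot duplicate jars; the directory and jar names are simulated for illustration:

```shell
# Sketch: spot conflicting LZO jars in the Hadoop lib dir.
# The directory and jar names here are simulated for illustration;
# on a real node, set libdir=$HADOOP_HOME/lib instead.
libdir=/tmp/demo-hadoop/lib
mkdir -p "$libdir"
touch "$libdir/hadoop-lzo-0.4.12.jar"
touch "$libdir/hadoop-gpl-compression-0.1.0.jar"   # stale jar shadowing the new one

# More than one hit means two LZO builds can end up on the classpath,
# which can produce exactly this kind of ClassCastException.
ls "$libdir" | grep -iE 'lzo|gpl-compression'
```

If two jars show up, removing the stale one from every node's lib dir and restarting is usually the fix; the same check is worth running on each tasktracker node, not just the one you built on.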



On Tue, Jul 26, 2011 at 1:39 PM, Koert Kuipers <koert@tresata.com> wrote:

> my installation notes for lzo-hadoop (might be wrong or incomplete):
>
> we run centos 5.6 and cdh3
>
> yum -y install lzo
> git clone https://github.com/toddlipcon/hadoop-lzo.git
> cd hadoop-lzo
> ant
> cd build
> cp hadoop-lzo-0.4.10/hadoop-lzo-0.4.10.jar /usr/lib/hadoop/lib
> cp -r hadoop-lzo-0.4.10/lib/native /usr/lib/hadoop/lib
>
>
> in core-site.xml:
>  <property>
>     <name>io.compression.codecs</name>
>
> <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
>     <final>true</final>
>   </property>
>
>   <property>
>     <name>io.compression.codec.lzo.class</name>
>     <value>com.hadoop.compression.lzo.LzoCodec</value>
>     <final>true</final>
>   </property>
>
>
> in mapred-site.xml:
>   <property>
>     <name>mapred.compress.map.output</name>
>     <value>true</value>
>     <final>false</final>
>   </property>
>
>   <property>
>     <name>mapred.map.output.compression.codec</name>
>     <value>com.hadoop.compression.lzo.LzoCodec</value>
>     <final>false</final>
>   </property>
>
>   <property>
>     <name>mapred.output.compress</name>
>     <value>true</value>
>     <final>false</final>
>   </property>
>
>   <property>
>     <name>mapred.output.compression.codec</name>
>     <value>com.hadoop.compression.lzo.LzoCodec</value>
>     <final>false</final>
>   </property>
>
>   <property>
>     <name>mapred.output.compression.type</name>
>     <value>BLOCK</value>
>     <final>false</final>
>   </property>
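>
> one more hedged note (per the hadoop-lzo README, not verified here):
> LzoCodec writes a raw LZO stream, while LzopCodec writes the lzop file
> format that the lzop tool and LzoIndexer understand. so if the final job
> output should be indexable .lzo files, the output codec should probably
> be LzopCodec instead:
>
>   <property>
>     <name>mapred.output.compression.codec</name>
>     <value>com.hadoop.compression.lzo.LzopCodec</value>
>     <final>false</final>
>   </property>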
>
>
> On Mon, Jul 25, 2011 at 7:14 PM, Alejandro Abdelnur <tucu@cloudera.com>wrote:
>
>> Vikas,
>>
>> You should be able to use the Snappy codec with some minor tweaks
>> from http://code.google.com/p/hadoop-snappy/ until a Hadoop release
>> ships with Snappy support.
>>
>> Thxs.
>>
>> Alejandro.
>>
>> On Mon, Jul 25, 2011 at 4:04 AM, Vikas Srivastava
>> <vikas.srivastava@one97.net> wrote:
>> > Hey,
>> >
>> > I just want to use compression in Hadoop, and I've heard LZO is among
>> > the best (after Snappy).
>> >
>> > Could anyone already using any kind of compression with Hadoop 0.20.2
>> > please share their setup?
>> >
>> >
>> >
>> > --
>> > With Regards
>> > Vikas Srivastava
>> >
>> > DWH & Analytics Team
>> > Mob:+91 9560885900
>> > One97 | Let's get talking !
>> >
>>
>
>
