hadoop-hdfs-user mailing list archives

From German Florez-Larrahondo <german...@samsung.com>
Subject RE: MR2 Job over LZO data
Date Fri, 07 Mar 2014 14:23:26 GMT
King

Here is my raw log of installing Hadoop LZO. This works on Hadoop 2.2.0 and 2.3.0.

 

I hope this helps

 

./g

 

 

Where to get Hadoop LZO

https://github.com/twitter/hadoop-lzo

 

http://asmarterplanet.com/studentsfor/blog/2013/11/hadoop-cluster-module-lzo-compression.html

 

Requirements

On CentOS:

sudo yum install lzo*  --> /usr/lib64/liblzo2.so.2..

 

On ubuntu: 

sudo apt-get install liblzo -->  on X86:  /usr/lib64/liblzo2.so.2   

 

Clone:

git clone https://github.com/twitter/hadoop-lzo.git

 

Follow the instructions in README.md from this GitHub site; basically:

 

cd hadoop-lzo

mvn clean package test

 

To enable this at run time do:

a.       Copy the jar to hadoop/share/hadoop/common/ (this avoids having to
modify the classpath, which you would need to do if you put the jar somewhere else)

 

cp lzo././target/hadoop-lzo-0.4.20-SNAPSHOT.jar  .. hadoop/share/hadoop/common/

 

b.       Copy /usr/lib64/liblzo2.so.2 to  .. hadoop/lib/native/
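
The two copy steps above can be sanity-checked with a small Java program. This is only a minimal sketch: the class name LzoInstallCheck, the use of HADOOP_HOME, and the assumption that the jar matches hadoop-lzo-*.jar are mine, not part of the hadoop-lzo project. It only checks that the files exist; it does not load the codec.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class LzoInstallCheck {

    /** Returns the problems found; an empty list means both files are present. */
    public static List<String> check(Path hadoopHome) throws IOException {
        List<String> problems = new ArrayList<>();

        // Step a: the hadoop-lzo jar copied into share/hadoop/common/
        Path commonDir = hadoopHome.resolve("share/hadoop/common");
        boolean jarFound = false;
        if (Files.isDirectory(commonDir)) {
            try (DirectoryStream<Path> jars =
                     Files.newDirectoryStream(commonDir, "hadoop-lzo-*.jar")) {
                jarFound = jars.iterator().hasNext();
            }
        }
        if (!jarFound) {
            problems.add("no hadoop-lzo-*.jar in " + commonDir);
        }

        // Step b: the native LZO library copied into lib/native/
        Path nativeLib = hadoopHome.resolve("lib/native/liblzo2.so.2");
        if (!Files.exists(nativeLib)) {
            problems.add("missing " + nativeLib);
        }
        return problems;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the install directory is passed as an argument or set in HADOOP_HOME.
        String home = args.length > 0 ? args[0] : System.getenv("HADOOP_HOME");
        if (home == null) {
            System.err.println("usage: java LzoInstallCheck <hadoop-install-dir>");
            return;
        }
        List<String> problems = check(Paths.get(home));
        if (problems.isEmpty()) {
            System.out.println("hadoop-lzo jar and liblzo2.so.2 are in place");
        } else {
            problems.forEach(System.out::println);
        }
    }
}
```

Run it as "java LzoInstallCheck /path/to/hadoop" on each node where the files were copied.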

 

 

From: Gordon Wang [mailto:gwang@gopivotal.com] 
Sent: Thursday, March 06, 2014 11:50 PM
To: user@hadoop.apache.org
Subject: Re: MR2 Job over LZO data

 

You can get the source code from https://github.com/twitter/hadoop-lzo
and then compile it against Hadoop 2.2.0.

 

As far as I remember, as long as you rebuild it, LZO should work with Hadoop 2.2.0.

 

On Thu, Mar 6, 2014 at 6:29 PM, KingDavies <kingdavies@gmail.com> wrote:

Running on Hadoop 2.2.0

 

The Java MR2 job works as expected on an uncompressed data source using the
TextInputFormat.class.

But when using the LZO format the job fails:

import com.hadoop.mapreduce.LzoTextInputFormat;

job.setInputFormatClass(LzoTextInputFormat.class);

 

Dependencies from the maven repository:

http://maven.twttr.com/com/hadoop/gplcompression/hadoop-lzo/0.4.19/

Also tried with elephant-bird-core 4.4

 

The same data can be queried without problems from Hive (0.12) on the same
cluster.

 

 

The exception:

Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at com.hadoop.mapreduce.LzoTextInputFormat.listStatus(LzoTextInputFormat.java:62)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340)
at com.hadoop.mapreduce.LzoTextInputFormat.getSplits(LzoTextInputFormat.java:101)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:491)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:508)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
at com.cloudreach.DataQuality.Main.main(Main.java:42)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

 

I believe the issue is related to the changes in Hadoop 2, but where can I
find a Hadoop 2 compatible version?
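
For context, org.apache.hadoop.mapreduce.JobContext is a class in Hadoop 1 and an interface in Hadoop 2, so a jar compiled against Hadoop 1 would be expected to fail with this kind of IncompatibleClassChangeError on a Hadoop 2 cluster. A minimal sketch to confirm which form is on the classpath (the class name JobContextCheck and the describe helper are hypothetical, introduced only for this illustration):

```java
public class JobContextCheck {

    /** Reports whether the named type on the current classpath is a class or an interface. */
    public static String describe(String className) {
        try {
            // initialize=false: inspect the type without running its static initializers
            Class<?> type = Class.forName(
                className, false, JobContextCheck.class.getClassLoader());
            return type.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException e) {
            return "not found";
        }
    }

    public static void main(String[] args) {
        String name = "org.apache.hadoop.mapreduce.JobContext";
        // With Hadoop 2 jars on the classpath this should report "interface";
        // without any Hadoop jars it reports "not found".
        System.out.println(name + " is: " + describe(name));
    }
}
```

Running it with the cluster's Hadoop jars on the classpath shows what the job sees at run time; the hadoop-lzo jar then has to be built against that same Hadoop version.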

 

Thanks





 

-- 

Regards

Gordon Wang

