From: German Florez-Larrahondo
To: user@hadoop.apache.org
Subject: RE: MR2 Job over LZO data
Date: Fri, 07 Mar 2014 08:23:26 -0600

King

Here is my raw log of installing Hadoop LZO. This works on Hadoop 2.2.0 and 2.3.0.

 

I hope this helps

 

./g

 

 

Where to get Hadoop LZO

https://github.com/twitter/hadoop-lzo

http://asmarterplanet.com/studentsfor/blog/2013/11/hadoop-cluster-module-lzo-compression.html

Requirements

On CentOS:

sudo yum install lzo*  --> /usr/lib64/liblzo2.so.2…

On Ubuntu:

sudo apt-get install liblzo2-dev  --> on x86: /usr/lib64/liblzo2.so.2

Clone:

git clone https://github.com/twitter/hadoop-lzo.git

Follow the instructions in the README.md on that GitHub site; basically:

cd hadoop-lzo

mvn clean package test

To enable this at run time:

a. Copy the library jar to hadoop/share/hadoop/common (so you don't have to modify classpaths, as you would if you put the library somewhere else)

cp lzo…/./target/hadoop-lzo-0.4.20-SNAPSHOT.jar  .. hadoop/share/hadoop/common/

b. Copy /usr/lib64/liblzo2.so.2 to .. hadoop/lib/native/
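The two copy steps above are easy to get slightly wrong, so a quick check that both files landed where expected can save a failed job submission. The following is only a sketch: the class name `LzoInstallCheck` and the use of a `HADOOP_HOME` environment variable are assumptions for illustration, and the relative paths are simply the ones from this log, so adjust them to your layout.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reports which hadoop-lzo artifacts are missing under a Hadoop home. */
final class LzoInstallCheck {
    // Relative locations taken from the two copy steps in this log; adjust for your layout.
    static final String JAR = "share/hadoop/common/hadoop-lzo-0.4.20-SNAPSHOT.jar";
    static final String NATIVE_LIB = "lib/native/liblzo2.so.2";

    /** Returns the expected files that are absent under hadoopHome (an empty list means both are present). */
    static List<Path> missing(Path hadoopHome) {
        List<Path> absent = new ArrayList<>();
        for (String rel : new String[] {JAR, NATIVE_LIB}) {
            Path p = hadoopHome.resolve(rel);
            // isRegularFile follows symlinks, so a symlinked liblzo2.so.2 with a valid target counts as present.
            if (!Files.isRegularFile(p)) {
                absent.add(p);
            }
        }
        return absent;
    }

    public static void main(String[] args) {
        // HADOOP_HOME is an assumption; pass the install directory as the first argument to override it.
        String home = args.length > 0 ? args[0] : System.getenv().getOrDefault("HADOOP_HOME", ".");
        List<Path> absent = missing(Paths.get(home));
        if (absent.isEmpty()) {
            System.out.println("hadoop-lzo jar and liblzo2 are both in place under " + home);
        } else {
            absent.forEach(p -> System.out.println("missing: " + p));
        }
    }
}
```

Run it with the Hadoop install directory as the only argument. It checks file presence only; it does not verify that the jar was built against the Hadoop version on the cluster.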

 

 

From: Gordon Wang [mailto:gwang@gopivotal.com]
Sent: Thursday, March 06, 2014 11:50 PM
To: user@hadoop.apache.org
Subject: Re: MR2 Job over LZO data

 

You can try getting the source code from https://github.com/twitter/hadoop-lzo and then compiling it against Hadoop 2.2.0.

As far as I remember, as long as you rebuild it, LZO should work with Hadoop 2.2.0.

 

On Thu, Mar 6, 2014 at 6:29 PM, KingDavies <kingdavies@gmail.com> wrote:

Running on Hadoop 2.2.0

The Java MR2 job works as expected on an uncompressed data source using TextInputFormat.class.

But when using the LZO format the job fails:

import com.hadoop.mapreduce.LzoTextInputFormat;

job.setInputFormatClass(LzoTextInputFormat.class);

 

Dependencies from the Maven repository:

http://maven.twttr.com/com/hadoop/gplcompression/hadoop-lzo/0.4.19/

Also tried with elephant-bird-core 4.4

The same data can be queried fine from within Hive (0.12) on the same cluster.

 

 

The exception:

Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
    at com.hadoop.mapreduce.LzoTextInputFormat.listStatus(LzoTextInputFormat.java:62)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340)
    at com.hadoop.mapreduce.LzoTextInputFormat.getSplits(LzoTextInputFormat.java:101)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:491)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:508)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
    at com.cloudreach.DataQuality.Main.main(Main.java:42)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

 

I believe the issue is related to the changes in Hadoop 2, but where can I find a Hadoop 2 compatible version?
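For context on that error: org.apache.hadoop.mapreduce.JobContext was a concrete class in Hadoop 1.x and is an interface in Hadoop 2.x, so a jar compiled against the older API fails in exactly this way at run time. One way to confirm which flavour is on a given classpath is a small reflection probe. This is a minimal sketch, not part of Hadoop or hadoop-lzo; the class name `ClassKindProbe` and its `describe` method are made up for illustration.

```java
/** Hypothetical diagnostic: reports whether a named type is a class or an interface on the current classpath. */
final class ClassKindProbe {
    /** Returns "interface", "class", or "not found" for the given fully qualified type name. */
    static String describe(String className) {
        try {
            // initialize=false: inspect the type without running its static initializers.
            Class<?> c = Class.forName(className, false, ClassKindProbe.class.getClassLoader());
            return c.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException e) {
            return "not found";
        }
    }

    public static void main(String[] args) {
        // Defaults to the type named in the exception; pass another fully qualified name to probe it instead.
        String name = args.length > 0 ? args[0] : "org.apache.hadoop.mapreduce.JobContext";
        System.out.println(name + " -> " + describe(name));
    }
}
```

Run with the cluster's jars on the classpath (for example, the output of `hadoop classpath`), it should report an interface on Hadoop 2.x. A hadoop-lzo jar that still expects a class then needs to be rebuilt against that Hadoop version.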

 

Thanks



 

--

Regards

Gordon Wang
