Subject: Re: LZO support in Spark 1.0.0 - nothing seems to work
From: Vipul Pandey
Date: Wed, 17 Sep 2014 17:51:42 -0700
To: rogthefrog
Cc: user@spark.incubator.apache.org
Message-Id: <9F298CDE-9E28-4AA8-BC8B-D13D853ED833@gmail.com>
In-Reply-To: <1411000826195-14494.post@n3.nabble.com>

It works for me:

export JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/native
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/native
export SPARK_LIBRARY_PATH=$SPARK_LIBRARY_PATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/native
export SPARK_CLASSPATH=$SPARK_CLASSPATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/hadoop-lzo-cdh4-0.4.15-gplextras.jar

I hope you are adding this to the code:

val conf = sc.hadoopConfiguration
conf.set("io.compression.codecs","com.hadoop.compression.lzo.LzopCodec")

Vipul

On Sep 17, 2014, at 5:40 PM, rogthefrog wrote:

> I have a HDFS cluster managed with CDH Manager. Version is CDH 5.1 with
> matching GPLEXTRAS parcel. LZO works with Hive and Pig, but I can't make it
> work with Spark 1.0.0. I've tried:
>
> * Setting this:
>
> HADOOP_OPTS="-Djava.net.preferIPv4Stack=true $HADOOP_CLIENT_OPTS
> -Djava.library.path=/opt/cloudera/parcels/GPLEXTRAS/lib/hadoop/lib/native/"
>
> * Setting this in spark-env.sh. I tried with and without "export". I tried
> in CDH Manager and manually on the host.
>
> export SPARK_CLASSPATH=$SPARK_CLASSPATH:/opt/cloudera/parcels/GPLEXTRAS/lib/hadoop/lib/hadoop-lzo.jar
> export SPARK_LIBRARY_PATH=$SPARK_LIBRARY_PATH:/opt/cloudera/parcels/GPLEXTRAS/lib/hadoop/lib/native/
>
> * Setting this in /etc/spark/conf/spark-defaults.conf:
>
> spark.executor.extraLibraryPath /opt/cloudera/parcels/GPLEXTRAS/lib/hadoop/lib/native
> spark.spark.executor.extraClassPath /opt/cloudera/parcels/GPLEXTRAS/lib/hadoop/lib/hadoop-lzo.jar
>
> * Adding this in CDH manager:
>
> export LD_LIBRARY_PATH=/opt/cloudera/parcels/GPLEXTRAS/lib/hadoop/lib/native
>
> * Hardcoding
> -Djava.library.path=/opt/cloudera/parcels/GPLEXTRAS/lib/hadoop/lib/native in
> the Spark command
>
> * Symlinking the gpl compression binaries into
> /opt/cloudera/parcels/CDH/lib/hadoop/lib/native
>
> * Symlinking the gpl compression binaries into /usr/lib
>
> And nothing worked. When I run pyspark I get this:
>
> 14/09/17 20:38:54 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
>
> and when I try to run a simple job on a LZO file in HDFS I get this:
>
> distFile.count()
> 14/09/17 13:51:54 ERROR GPLNativeCodeLoader: Could not load native gpl library
> java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
>         at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1886)
>         at java.lang.Runtime.loadLibrary0(Runtime.java:849)
>         at java.lang.System.loadLibrary(System.java:1088)
>         at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32)
>         at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:71)
>
> Can anybody help please? Many thanks.
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/LZO-support-in-Spark-1-0-0-nothing-seems-to-work-tp14494.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> For additional commands, e-mail: user-help@spark.apache.org
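
P.S. The UnsatisfiedLinkError in the quoted trace says the JVM found no gplcompression library in any directory on java.library.path. Before changing more settings, it may help to confirm that the directories being exported really contain the library file. Below is a rough shell sketch of such a check. find_native_lib is a hypothetical helper name (not part of Spark, Hadoop, or CDH), and it assumes the library file is named libgplcompression.so, possibly with a version suffix, as is usual on Linux.

```shell
#!/bin/sh
# Sketch: print each directory on a colon-separated search path that contains
# lib<name>.so (or a versioned variant such as lib<name>.so.0).
# find_native_lib is a hypothetical helper, not part of Spark or Hadoop.
# Returns 0 if at least one directory matched, 1 otherwise.
find_native_lib() {
    name=$1
    search_path=$2
    found=1
    old_ifs=$IFS
    IFS=:
    for dir in $search_path; do
        IFS=$old_ifs
        # Skip empty entries, e.g. from a leading ':' when the variable was unset.
        [ -n "$dir" ] || continue
        for candidate in "$dir"/lib"$name".so*; do
            if [ -e "$candidate" ]; then
                echo "$dir"
                found=0
                break
            fi
        done
    done
    IFS=$old_ifs
    return $found
}

# Example: check the variables exported in the reply above.
find_native_lib gplcompression "${JAVA_LIBRARY_PATH:-}:${LD_LIBRARY_PATH:-}" \
    || echo "libgplcompression not found on JAVA_LIBRARY_PATH or LD_LIBRARY_PATH"
```

This only checks that the files exist on the host where it runs. It does not show which java.library.path the Spark driver or executor JVMs actually end up with, so it would need to be run on each node, and a match here does not by itself guarantee the library loads.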