From: Eran Kutner
Date: Fri, 3 Aug 2012 21:21:10 +0300
Subject: Re: Can't use snappy codec
To: user@flume.apache.org

Hi Patrick,

With the new script it's working OK. The old script (the one that comes
with CDH4) had a problem.

This code in the old script:

    local HADOOP_JAVA_LIBRARY_PATH=$(HADOOP_CLASSPATH="$FLUME_CLASSPATH" \
        ${HADOOP_IN_PATH} org.apache.flume.tools.GetJavaProperty \
        java.library.path 2>/dev/null)

would set HADOOP_JAVA_LIBRARY_PATH to
"java.library.path=//usr/lib/hadoop/lib/native", which would end up setting
FLUME_JAVA_LIBRARY_PATH to ":java.library.path=//usr/lib/hadoop/lib/native".
That would then be used to start the process with
"-Djava.library.path=:java.library.path=//usr/lib/hadoop/lib/native", which
is obviously wrong.

The new script has code to clean up the extra "java.library.path=" prefix
returned by the code above, as sketched below.
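For illustration, here is a minimal sketch of that cleanup. This is a
paraphrase of the idea, not the actual flume-ng source; only the snippet
above is quoted from the script, the stripping step is my own:

    # Sketch only: GetJavaProperty prints "java.library.path=<path>", so
    # the property name must be stripped before the value can be reused.
    local HADOOP_JAVA_LIBRARY_PATH=$(HADOOP_CLASSPATH="$FLUME_CLASSPATH" \
        ${HADOOP_IN_PATH} org.apache.flume.tools.GetJavaProperty \
        java.library.path 2>/dev/null)
    # Keep only what follows the first '=', e.g. "//usr/lib/hadoop/lib/native":
    HADOOP_JAVA_LIBRARY_PATH=${HADOOP_JAVA_LIBRARY_PATH#*=}

With that, the agent is started with a plain
"-Djava.library.path=//usr/lib/hadoop/lib/native" instead of the doubled
property name.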
So there are two options: either the implementation of
org.apache.flume.tools.GetJavaProperty changed, so that it now returns the
extra property name and it was therefore necessary to strip it in the
script, or the original script included in CDH4 had a bug that was fixed
in 1.2.

-eran


On Fri, Aug 3, 2012 at 7:43 PM, Patrick Wendell wrote:
> Hey Eran,
>
> So the flume-ng script works by trying to figure out what library path
> Hadoop is using and then replicating that for Flume. If
> HADOOP_JAVA_LIBRARY_PATH is set it will try to use that. Otherwise it
> tries to infer the path based on what the hadoop script itself
> determines.
>
> What is the path getting set to in your case, and how does that differ
> from expectations? Just trying to figure out what the bug is.
>
> - Patrick
>
> On Fri, Aug 3, 2012 at 9:25 AM, Eran Kutner wrote:
> > Thanks Patrick, that helped me figure out the problem, and it looks
> > like a bug in the "flume-ng" file provided with CDH4: it was messing
> > up the library.path.
> > I copied the file that was included in the Flume 1.2.0 distribution
> > and it now works OK.
> >
> > Thanks for your help.
> >
> > -eran
> >
> >
> > On Fri, Aug 3, 2012 at 6:36 PM, Patrick Wendell wrote:
> >>
> >> Hey Eran,
> >>
> >> You need to make sure the Flume JVM gets passed
> >> -Djava.library.path=XXX with the correct path to where your native
> >> snappy libraries are located.
> >>
> >> You can set this by adding the option directly to the flume-ng
> >> runner script.
> >>
> >> - Patrick
> >>
> >> On Fri, Aug 3, 2012 at 7:33 AM, Eran Kutner wrote:
> >> > Hi,
> >> > I'm trying to use the snappy codec but keep getting "native snappy
> >> > library not available" errors.
> >> > I'm using CDH4 but replaced the Flume 1.1 JARs that are included
> >> > with that distribution with Flume 1.2 JARs.
> >> > I tried everything I can think of, including symlinking the hadoop
> >> > native library under the flume-ng/lib/ directory, but nothing
> >> > helps.
> >> > Any idea how to resolve this?
> >> >
> >> > This is the error:
> >> > 2012-08-03 10:23:30,598 WARN util.NativeCodeLoader: Unable to load
> >> > native-hadoop library for your platform... using builtin-java
> >> > classes where applicable
> >> > 2012-08-03 10:23:35,670 WARN hdfs.HDFSEventSink: HDFS IO error
> >> > java.io.IOException: java.lang.RuntimeException: native snappy
> >> > library not available
> >> >     at org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:202)
> >> >     at org.apache.flume.sink.hdfs.BucketWriter.access$000(BucketWriter.java:48)
> >> >     at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:155)
> >> >     at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:152)
> >> >     at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:125)
> >> >     at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:152)
> >> >     at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:307)
> >> >     at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:717)
> >> >     at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:714)
> >> >     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> >> >     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> >> >     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> >> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> >> >     at java.lang.Thread.run(Thread.java:662)
> >> > Caused by: java.lang.RuntimeException: native snappy library not
> >> > available
> >> >     at org.apache.hadoop.io.compress.SnappyCodec.createCompressor(SnappyCodec.java:135)
> >> >     at org.apache.hadoop.io.compress.SnappyCodec.createOutputStream(SnappyCodec.java:84)
> >> >     at org.apache.flume.sink.hdfs.HDFSCompressedDataStream.open(HDFSCompressedDataStream.java:70)
> >> >     at org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:195)
> >> >     ... 13 more
> >> >
> >> > And my sink configuration:
> >> > flume05.sinks.hdfsSink.type = hdfs
> >> > #flume05.sinks.hdfsSink.type = logger
> >> > flume05.sinks.hdfsSink.channel = memoryChannel
> >> > flume05.sinks.hdfsSink.hdfs.path=hdfs://hadoop2-m1:8020/test-events/%Y-%m-%d
> >> > flume05.sinks.hdfsSink.hdfs.filePrefix=raw-events.avro
> >> > flume05.sinks.hdfsSink.hdfs.rollInterval=60
> >> > flume05.sinks.hdfsSink.hdfs.rollCount=0
> >> > flume05.sinks.hdfsSink.hdfs.rollSize=0
> >> > flume05.sinks.hdfsSink.hdfs.fileType=CompressedStream
> >> > flume05.sinks.hdfsSink.hdfs.codeC=snappy
> >> > flume05.sinks.hdfsSink.hdfs.writeFormat=Text
> >> > flume05.sinks.hdfsSink.hdfs.batchSize=1000
> >> > flume05.sinks.hdfsSink.serializer = avro_event
> >> >
> >> > Thanks.
> >> >
> >> > -eran
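
For anyone who wants Patrick's workaround rather than replacing the
script: a minimal sketch, assuming the native-library location mentioned
in this thread and a flume-ng script that picks up JAVA_OPTS from
conf/flume-env.sh (both assumptions; verify against your installation):

    # conf/flume-env.sh -- sketch only; the path below is the CDH4
    # location from this thread, adjust it for your system.
    export JAVA_OPTS="$JAVA_OPTS -Djava.library.path=/usr/lib/hadoop/lib/native"

Then start the agent as usual, e.g.:

    flume-ng agent -n flume05 -c conf -f flume.conf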