Mailing-List: contact user-help@flink.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@flink.apache.org
Delivered-To: mailing list user@flink.apache.org
MIME-Version: 1.0
From: Yury Ruchin
Date: Thu, 15 Dec 2016 14:39:17 +0300
Subject: Re: Jar hell when running streaming job in YARN session
To: user@flink.apache.org
Content-Type: multipart/alternative; boundary=089e01229acc59502a0543b0e79e
archived-at: Thu, 15 Dec 2016 11:39:35 -0000

--089e01229acc59502a0543b0e79e
Content-Type: text/plain; charset=UTF-8

Hi Kidong, Stephan,

First of all, you've saved me days of investigation - thanks a lot! The
problem is solved now. More details follow.

I use the official Flink 1.1.3 + Hadoop 2.7 distribution. My problem was
indeed caused by a clash of classes under "com.google" in my fat jar and in
the YARN library directories. The shaded Guava classes in the Flink
distribution didn't hurt. Initially I took the wrong way - I tried to change
the class loading order. Instead, I should have used the same shading
approach that Flink uses and that Kidong described above - simply relocate
the problematic classes to another package in the fat jar.

Thanks again,
Yury

2016-12-15 14:21 GMT+03:00 Stephan Ewen <sewen@apache.org>:

> Hi Yuri!
>
> Flink should hide Hadoop's Guava, to avoid this issue.
>
> Did you build Flink yourself from source? Maybe you are affected by this
> issue:
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/building.html#dependency-shading
>
> Stephan
>
>
> On Thu, Dec 15, 2016 at 11:18 AM, Kidong Lee <mykidong@gmail.com> wrote:
>
>> To avoid a Guava conflict, I use the Maven Shade plugin to package my fat jar.
>> If you use Maven, the shade plugin configuration looks like this:
>> ...
>> <plugin>
>>   <groupId>org.apache.maven.plugins</groupId>
>>   <artifactId>maven-shade-plugin</artifactId>
>>   <version>2.4.2</version>
>>   <configuration>
>>     <createDependencyReducedPom>false</createDependencyReducedPom>
>>     <shadedArtifactAttached>true</shadedArtifactAttached>
>>     <shadedClassifierName>flink-job</shadedClassifierName>
>>     <relocations>
>>       <relocation>
>>         <pattern>com.google</pattern>
>>         <shadedPattern>yourpackage.shaded.google</shadedPattern>
>>       </relocation>
>>     </relocations>
>>     <transformers>
>>       <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
>>       <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
>>         <resource>META-INF/spring.handlers</resource>
>>       </transformer>
>>       <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
>>         <resource>META-INF/spring.schemas</resource>
>>       </transformer>
>>     </transformers>
>>     <filters>
>>       <filter>
>>         <artifact>*:*</artifact>
>>         <excludes>
>>           <exclude>org/datanucleus/**</exclude>
>>           <exclude>META-INF/*.SF</exclude>
>>           <exclude>META-INF/*.DSA</exclude>
>>           <exclude>META-INF/*.RSA</exclude>
>>         </excludes>
>>       </filter>
>>     </filters>
>>   </configuration>
>> </plugin>
>> ...
>>
>>
>> To package the fat jar:
>>
>> mvn -e -DskipTests=true clean install shade:shade;
>>
>>
>> I hope it helps.
>>
>> - Kidong Lee.
>>
>>
>> 2016-12-15 19:04 GMT+09:00 Yury Ruchin <yuri.ruchin@gmail.com>:
>>
>>> Hi,
>>>
>>> I have run into a classpath issue when running a Flink streaming job in
>>> a YARN session. I package my app into a fat jar with all the dependencies
>>> needed. One of them is Google Guava. I then submit the jar to the session.
>>> The task managers pre-created by the session build their classpath from the
>>> FLINK_LIB_DIR and Hadoop / YARN lib directories. Unfortunately, there is a
>>> dated Guava version pulled along with Hadoop dependencies which conflicts
>>> with the version my app needs. Even worse, the Flink lib dir and Hadoop
>>> libraries take precedence over my jar.
>>>
>>> If I remember correctly, in Spark there is an option meaning "user
>>> classpath goes first when looking for classes". I couldn't find anything
>>> similar for Flink. I tried "flink run -C file:///path/to/my/libraries" in
>>> the hope of extending the classpath, but the old Guava version takes
>>> precedence anyway.
>>>
>>> How else can I bump "priority" of my app jar during the load process?
>>>
>>> Thanks,
>>> Yury
>>>
>>
>>
>

--089e01229acc59502a0543b0e79e
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Hi Kidong, Stephan,

First of all, you've saved me days of investigation - thanks a lot! The problem is solved now. More details follow.

I use the official Flink 1.1.3 + Hadoop 2.7 distribution. My problem was indeed caused by a clash of classes under "com.google" in my fat jar and in the YARN library directories. The shaded Guava classes in the Flink distribution didn't hurt. Initially I took the wrong way - I tried to change the class loading order. Instead, I should have used the same shading approach that Flink uses and that Kidong described above - simply relocate the problematic classes to another package in the fat jar.
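
To confirm which jar actually supplies a clashing class, one option is to ask the JVM for the class's code source. The snippet below is only a minimal sketch and is not taken from this thread: the class name `ClassOrigin` and the helper `locationOf` are hypothetical, and it uses nothing beyond the standard `java.lang.Class` and `java.security.CodeSource` APIs.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Flink): reports where a class was loaded from.
class ClassOrigin {

    // Returns the jar or directory URL of the class, or "<bootstrap>" when the JVM
    // reports no code source (typical for core JDK classes such as java.lang.String).
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "<bootstrap>";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name, e.g. a Guava class; defaults to a JDK class.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " <- " + locationOf(Class.forName(name)));
    }
}
```

Logging `ClassOrigin.locationOf(...)` for a Guava class from inside the job should show whether it comes from the fat jar or from the Hadoop / YARN lib directories, assuming no security manager blocks access to the protection domain.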

Thanks again,
Yury

2016-12-15 14:21 GMT+03:00 Stephan Ewen <sewen@apache.org>:
Hi Yuri!

Flink should hide Hadoop's Guava, to avoid this issue.

Did you build Flink yourself from source? Maybe you are affected by this issue: https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/building.html#dependency-shading

Stephan

On Thu, Dec 15, 2016 at 11:18 AM, Kidong Lee <mykidong@gmail.com> wrote:
To avoid a Guava conflict, I use the Maven Shade plugin to package my fat jar.
If you use Maven, the shade plugin configuration looks like this:
...
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.4.2</version>
  <configuration>
    <createDependencyReducedPom>false</createDependencyReducedPom>
    <shadedArtifactAttached>true</shadedArtifactAttached>
    <shadedClassifierName>flink-job</shadedClassifierName>
    <relocations>
      <relocation>
        <pattern>com.google</pattern>
        <shadedPattern>yourpackage.shaded.google</shadedPattern>
      </relocation>
    </relocations>
    <transformers>
      <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
      <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
        <resource>META-INF/spring.handlers</resource>
      </transformer>
      <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
        <resource>META-INF/spring.schemas</resource>
      </transformer>
    </transformers>
    <filters>
      <filter>
        <artifact>*:*</artifact>
        <excludes>
          <exclude>org/datanucleus/**</exclude>
          <exclude>META-INF/*.SF</exclude>
          <exclude>META-INF/*.DSA</exclude>
          <exclude>META-INF/*.RSA</exclude>
        </excludes>
      </filter>
    </filters>
  </configuration>
</plugin>
...
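
As a rough illustration of what the `<relocation>` rule above does to class names, here is a small sketch. It is not the maven-shade-plugin implementation (the plugin also rewrites bytecode references and has its own matching rules); `RelocationSketch` and `relocate` are made-up names, and the prefix matching shown is a simplifying assumption.

```java
// Illustration only: mimics the effect of a <relocation> rule on a class name.
// This is NOT how maven-shade-plugin is implemented; it only shows the name mapping.
class RelocationSketch {

    // Rewrites names that equal the pattern or sit in a sub-package of it;
    // every other name is returned unchanged.
    static String relocate(String className, String pattern, String shadedPattern) {
        if (className.equals(pattern) || className.startsWith(pattern + ".")) {
            return shadedPattern + className.substring(pattern.length());
        }
        return className;
    }

    public static void main(String[] args) {
        // A Guava class moves under the shaded package.
        System.out.println(relocate("com.google.common.collect.ImmutableList",
                "com.google", "yourpackage.shaded.google"));
        // A class outside the pattern keeps its name.
        System.out.println(relocate("org.apache.flink.api.java.tuple.Tuple2",
                "com.google", "yourpackage.shaded.google"));
    }
}
```

Because the relocated copies live under `yourpackage.shaded.google`, they no longer share names with the `com.google` classes found in the Hadoop / YARN lib directories, so class loading order stops mattering for them.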


To package the fat jar:
mvn -e -DskipTests=true clean install shade:shade;


I hope it helps.

- Kidong Lee.





2016-12-15 19:04 GMT+09:00 Yury Ruchin <yuri.ruchin@gmail.com>:
Hi,

I have run into a classpath issue when running a Flink streaming job in a YARN session. I package my app into a fat jar with all the dependencies needed. One of them is Google Guava. I then submit the jar to the session. The task managers pre-created by the session build their classpath from the FLINK_LIB_DIR and Hadoop / YARN lib directories. Unfortunately, there is a dated Guava version pulled along with Hadoop dependencies which conflicts with the version my app needs. Even worse, the Flink lib dir and Hadoop libraries take precedence over my jar.

If I remember correctly, in Spark there is an option meaning "user classpath goes first when looking for classes". I couldn't find anything similar for Flink. I tried "flink run -C file:///path/to/my/libraries" in the hope of extending the classpath, but the old Guava version takes precedence anyway.

How else can I bump "priority" of my app jar during the load process?

Thanks,
Yury



--089e01229acc59502a0543b0e79e--