flink-user mailing list archives

From Yury Ruchin <yuri.ruc...@gmail.com>
Subject Re: Jar hell when running streaming job in YARN session
Date Thu, 15 Dec 2016 11:39:17 GMT
Hi Kidong, Stephan,

First of all, you've saved me days of investigation - thanks a lot! The
problem is solved now. More details follow.

I use the official Flink 1.1.3 + Hadoop 2.7 distribution. My problem was
indeed caused by a clash between the classes under "com.google" in my fat
jar and those in the YARN library directories. The shaded Guava classes in
the Flink distribution were not the culprit. Initially I went the wrong way:
I tried to change the class loading order. Instead, I should have used the
same shading approach that Flink uses and that Kidong described: simply
relocate the problematic classes to another package in the fat jar.
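
For anyone hitting a similar clash, a quick way to see which jar actually
wins is to ask the JVM where a class was loaded from. The sketch below is
only an illustration and was not part of the original fix: the helper name
ClassOrigin is hypothetical, and the Guava class name used as the default
argument is just an example of a class under "com.google".

```java
import java.security.CodeSource;

/** Reports where the JVM loaded a class from; handy for spotting classpath clashes. */
final class ClassOrigin {

    /** Returns the jar or directory a class came from, or a marker for bootstrap classes. */
    static String locate(Class<?> clazz) {
        CodeSource source = clazz.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Classes loaded by the bootstrap loader (e.g. java.lang.String) have no code source.
            return "<bootstrap or unknown>";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // Example default only; pass any fully qualified class name as the first argument.
        String name = args.length > 0 ? args[0] : "com.google.common.base.Preconditions";
        try {
            Class<?> clazz = Class.forName(name);
            System.out.println(name + " loaded from " + locate(clazz)
                    + " by " + clazz.getClassLoader());
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on the classpath");
        }
    }
}
```

Calling ClassOrigin.locate(...) from inside the job (for example, logging
the result once at startup) should show whether the class comes from the fat
jar or from one of the Hadoop / YARN lib directories.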

Thanks again,
Yury

2016-12-15 14:21 GMT+03:00 Stephan Ewen <sewen@apache.org>:

> Hi Yuri!
>
> Flink should hide Hadoop's Guava, to avoid this issue.
>
> Did you build Flink yourself from source? Maybe you are affected by this
> issue:
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/building.html#dependency-shading
>
> Stephan
>
>
> On Thu, Dec 15, 2016 at 11:18 AM, Kidong Lee <mykidong@gmail.com> wrote:
>
>> To avoid the Guava conflict, I use the Maven Shade plugin to package my
>> fat jar. If you use Maven, the Shade plugin configuration looks like this:
>> ...
>>
>> <plugin>
>>    <groupId>org.apache.maven.plugins</groupId>
>>    <artifactId>maven-shade-plugin</artifactId>
>>    <version>2.4.2</version>
>>    <configuration>
>>       <createDependencyReducedPom>false</createDependencyReducedPom>
>>       <shadedArtifactAttached>true</shadedArtifactAttached>
>>       <shadedClassifierName>flink-job</shadedClassifierName>
>>       <relocations>
>>          <relocation>
>>             <pattern>com.google</pattern>
>>             <shadedPattern>yourpackage.shaded.google</shadedPattern>
>>          </relocation>
>>       </relocations>
>>       <transformers>
>>          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
>>          <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
>>             <resource>META-INF/spring.handlers</resource>
>>          </transformer>
>>          <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
>>             <resource>META-INF/spring.schemas</resource>
>>          </transformer>
>>       </transformers>
>>       <filters>
>>          <filter>
>>             <artifact>*:*</artifact>
>>             <excludes>
>>                <exclude>org/datanucleus/**</exclude>
>>                <exclude>META-INF/*.SF</exclude>
>>                <exclude>META-INF/*.DSA</exclude>
>>                <exclude>META-INF/*.RSA</exclude>
>>             </excludes>
>>          </filter>
>>       </filters>
>>    </configuration>
>> </plugin>
>> ...
>>
>>
>> To package the fat jar:
>>
>> mvn -e -DskipTests=true clean install shade:shade;
>>
>>
>> I hope it helps.
>>
>> - Kidong Lee.
>>
>>
>>
>>
>>
>> 2016-12-15 19:04 GMT+09:00 Yury Ruchin <yuri.ruchin@gmail.com>:
>>
>>> Hi,
>>>
>>> I have run into a classpath issue when running a Flink streaming job in a
>>> YARN session. I package my app into a fat jar with all the dependencies
>>> it needs. One of them is Google Guava. I then submit the jar to the session.
>>> The task managers pre-created by the session build their classpath from
>>> FLINK_LIB_DIR and the Hadoop / YARN lib directories. Unfortunately, an
>>> outdated Guava version is pulled in along with the Hadoop dependencies, and
>>> it conflicts with the version my app needs. Even worse, the Flink lib dir
>>> and Hadoop libraries take precedence over my jar.
>>>
>>> If I remember correctly, Spark has an option meaning "user classpath
>>> goes first when looking for classes". I couldn't find anything similar
>>> for Flink. I tried "flink run -C file:///path/to/my/libraries", hoping to
>>> extend the classpath, but the old Guava version takes precedence anyway.
>>>
>>> How else can I raise the priority of my app jar during class loading?
>>>
>>> Thanks,
>>> Yury
>>>
>>
>>
>
