flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chesnay Schepler <ches...@apache.org>
Subject Re: [DISCUSS] Changing Flink's shading model
Date Tue, 20 Jun 2017 11:15:28 GMT
I would like to start working on this.

I've looked into adding a flink-shaded-guava module. Working against the 
shaded namespaces seems
to work without problems from the IDE, and we could forbid un-shaded 
usages with checkstyle.

So for the list of dependencies that we want to shade we currently got:

  * asm
  * guava
  * netty
  * hadoop
  * curator

I've had a chat with Stephan Ewan and he brought up kryo + chill as well.

The nice thing is that we can do this incrementally, one dependency at a 
time. As such i would propose
to go through the whole process for guava and see what problems arise.

This would include adding a flink-shaded module and a child 
flink-shaded-guava module to the flink repository
that are not part of the build process, replacing all usages of guava in 
Flink, adding the
checkstyle rule (optional) and deploying the artifact to maven central.

On 11.05.2017 10:54, Stephan Ewen wrote:
> @Ufuk  - I have never set up artifact deployment in Maven, could need some
> help there.
>
> Regarding shading Netty, I agree, would be good to do that as well...
>
> On Thu, May 11, 2017 at 10:52 AM, Ufuk Celebi <uce@apache.org> wrote:
>
>> The advantages you've listed sound really compelling to me.
>>
>> - Do you have time to implement these changes or do we need a volunteer? ;)
>>
>> - I assume that republishing the artifacts as you propose doesn't have
>> any new legal implications since we already publish them with our
>> JARs, right?
>>
>> - We might think about adding Netty to the list of shaded artifacts
>> since some dependency conflicts were reported recently. Would have to
>> double check the reported issues before doing that though. ;-)
>>
>> – Ufuk
>>
>>
>> On Wed, May 10, 2017 at 8:45 PM, Stephan Ewen <sewen@apache.org> wrote:
>>> @chesnay: I used ASM as an example in the proposal. Maybe I did not say
>>> that clearly.
>>>
>>> If we like that approach, we should deal with the other libraries (at
>> least
>>> the frequently used ones) in the same way.
>>>
>>>
>>> I would imagine to have a project layout like that:
>>>
>>> flink-shaded-deps
>>>    - flink-shaded-asm
>>>    - flink-shaded-guava
>>>    - flink-shaded-curator
>>>    - flink-shaded-hadoop
>>>
>>>
>>> "flink-shaded-deps" would not be built every time (and not be released
>>> every time), but only when needed.
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Wed, May 10, 2017 at 7:28 PM, Chesnay Schepler <chesnay@apache.org>
>>> wrote:
>>>
>>>> I like the idea, thank you for bringing it up.
>>>>
>>>> Given that the raised problems aren't really ASM specific would it make
>>>> sense to create one flink-shaded module that contains all frequently
>> shaded
>>>> libraries? (or maybe even all shaded dependencies by core modules) The
>>>> proposal limits the scope of this to ASM and i was wondering why.
>>>>
>>>> I also remember that there was a discussion recently about why we shade
>>>> things at all, and the idea of working against the shaded namespaces was
>>>> brought up. Back then i was expressing doubts as to whether IDE's would
>>>> properly support this; what's the state on that?
>>>>
>>>> On 10.05.2017 18:18, Stephan Ewen wrote:
>>>>
>>>>> Hi!
>>>>>
>>>>> This is a discussion about altering the way we handle dependencies and
>>>>> shading in Flink.
>>>>> I ran into quite a view problems trying to adjust / fix some shading
>>>>> issues
>>>>> during release validation.
>>>>>
>>>>> The issue is tracked under: https://issues.apache.org/jira
>>>>> /browse/FLINK-6529
>>>>> Bring this discussion thread up because it is a bigger issue
>>>>>
>>>>> *Problem*
>>>>>
>>>>> Currently, Flink shades dependencies like ASM and Guava into all jars
>> of
>>>>> projects that reference it and relocate the classes.
>>>>>
>>>>> There are some drawbacks to that approach, let's discuss them at the
>>>>> example of ASM:
>>>>>
>>>>>     - The ASM classes are for example in flink-core, flink-java,
>>>>> flink-scala,
>>>>> flink-runtime, etc.
>>>>>
>>>>>     - Users that reference these dependencies have the classes multiple
>>>>> times
>>>>> in the classpath. That is unclean (works, through, because the classes
>> are
>>>>> identical). The same happens when building the final dist. jar.
>>>>>
>>>>>     - Some of these dependencies require to include license files in
the
>>>>> shaded jar. It is hard to impossible to build a good automatic solution
>>>>> for
>>>>> that, partly due to Maven's very poor cross-project path support
>>>>>
>>>>>     - Most importantly: Scala does not support shading really well.
>> Scala
>>>>> classes have references to classes in more places than just the class
>>>>> names
>>>>> (apparently for Scala reflect support). Referencing a Scala project
>> with
>>>>> shaded ASM still requires to add a reference to unshaded ASM (at least
>> as
>>>>> a
>>>>> compile dependency).
>>>>>
>>>>> *Proposal*
>>>>>
>>>>> I propose that we build and deploy a asm-flink-shaded version of ASM
>> and
>>>>> directly program against the relocated namespaces. Since we never use
>>>>> classes that we relocate in public interfaces, Flink users will never
>> see
>>>>> the relocated class names. Internally, it does not hurt to use them.
>>>>>
>>>>>     - Proper maven dependency management, no hidden (shaded)
>> dependencies
>>>>>     - One copy of each class for shaded dependencies
>>>>>
>>>>>     - Proper Scala interoperability
>>>>>
>>>>>     - Natural License management (license is part of deployed
>>>>> asm-flink-shaded jar)
>>>>>
>>>>>
>>>>> Happy to hear thoughts!
>>>>>
>>>>> Stephan
>>>>>
>>>>>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message