flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-7420) Move all Avro code to flink-avro
Date Fri, 03 Nov 2017 10:50:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-7420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16237437#comment-16237437
] 

ASF GitHub Bot commented on FLINK-7420:
---------------------------------------

GitHub user StephanEwen opened a pull request:

    https://github.com/apache/flink/pull/4942

     [FLINK-7420] [avro] Move all Avro code to flink-avro (followup)

    ## What is the purpose of the change
    
    This is an extension of #4931 which adds some cleanups and improvements.
    Most notably, it makes the new `flink-avro` module independent of runtime dependencies
and Scala versions.
    
    ## Brief change log
    
      - Move the `SerializationSchema` classes to `flink-core` to make them accessible from
API projects without introducing runtime dependencies. This is done in a non-API-breaking
way.
      - Make `flink-avro` module independent of `flink-streaming-java`, `flink-runtime` and
hence independent of Scala versions.
      - Add various cleanups and fixes
      - Adds a test that validates that a KryoSerializer from Flink 1.3 (which has implicit
Avro dependency classes) is still deserializable with the state serializer utils (that implement
the serializer and state evolution).
    
    ## Verifying this change
    
      - Most of the changes are covered by existing tests
      - Adds a test that validates that a KryoSerializer from Flink 1.3 (which has implicit
Avro dependency classes) is still deserializable with the state serializer utils (that implement
the serializer and state evolution).
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): (**yes** / no): **This changes
the `flink-avro` module to be Scala independent.**
    
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (**yes**
/ no): **Moves the SerializationSchema classes, but leaves old classes in place (extending
new classes) to preserve the API compatibility**.
    
      - The serializers: (yes / **no** / don't know) 
      - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know)
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing,
Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
    
    ## Documentation
    
      - Does this pull request introduce a new feature? (yes / **no**)
      - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not
documented)
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/StephanEwen/incubator-flink avro

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4942.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4942
    
----
commit dd5ca19a7058ebe0ed90237ecb9ea6d9a342a9ec
Author: Stephan Ewen <sewen@apache.org>
Date:   2017-11-02T17:07:25Z

    [FLINK-7972] [core] Move SerializationSchema to 'flink-core'
    
    Moves the SerializationSchema and its related classes from
    flink-streaming-java to flink-core.
    
    That helps API level projects that depend on those classes
    to not pull in a dependency on runtime classes, and to
    not be Scala version dependent.
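
The non-API-breaking move described in this commit and in the pull request description (old classes left in place, extending the new ones) can be sketched roughly as follows. This is a minimal illustration, not the actual Flink code: `CoreSerializationSchema`, `LegacySerializationSchema`, `Utf8StringSchema` and `NewApiSink` are hypothetical names standing in for the new `flink-core` type, the old `flink-streaming-java` type, a user class and a consumer of the new type.

```java
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for the interface at its new location (flink-core). */
interface CoreSerializationSchema<T> extends Serializable {
    byte[] serialize(T element);
}

/**
 * Hypothetical stand-in for the interface at its old location.
 * It stays in place as an empty, deprecated subtype of the new interface,
 * so code compiled against the old name keeps working.
 */
@Deprecated
interface LegacySerializationSchema<T> extends CoreSerializationSchema<T> {
}

/** Existing user code that was written against the old interface. */
class Utf8StringSchema implements LegacySerializationSchema<String> {
    private static final long serialVersionUID = 1L;

    @Override
    public byte[] serialize(String element) {
        return element.getBytes(StandardCharsets.UTF_8);
    }
}

/** New code only needs to know about the relocated interface. */
class NewApiSink {
    static <T> int writtenBytes(CoreSerializationSchema<T> schema, T element) {
        return schema.serialize(element).length;
    }
}
```

Because the old type is a subtype of the new one, an instance such as `new Utf8StringSchema()` can be passed wherever the relocated interface is expected, while API-level modules only need a dependency on the module that hosts the new interface.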

commit 85db5d0f6b4ee72808aeaaf2efd38613cf80c89f
Author: Stephan Ewen <sewen@apache.org>
Date:   2017-11-03T10:28:18Z

    [hotfix] [tests] Remove console polluting output in tests of flink-streaming-java

commit d0de088da5e97890300b26517ab158a66a467ea5
Author: twalthr <twalthr@apache.org>
Date:   2017-08-16T10:17:00Z

    [FLINK-7420] [avro] Move all Avro code to flink-avro

commit fa1924b47dbe545a9903d0031124905cb15057cc
Author: Aljoscha Krettek <aljoscha.krettek@gmail.com>
Date:   2017-10-25T15:38:24Z

    [FLINK-7420] [avro] Replace GenericData.Array by dummy when reading TypeSerializers
    
    This also adds a new test that verifies that we correctly register
    Avro Serializers when they are present and modifies an existing test to
    verify that we correctly register dummy classes.

commit 276e8e6ae6deb68b46fc0e4a8bf1821ce6d71b87
Author: Aljoscha Krettek <aljoscha.krettek@gmail.com>
Date:   2017-10-30T14:02:18Z

    [FLINK-7420] [avro] Abstract all Avro interaction behind AvroUtils
    
    Before, we would try and dynamically load Avro-related classes in several
    places. Now, we only reflectively instantiate the right AvroUtils and
    all other operations are methods on this.
    
    The default AvroUtils throw exceptions with a helpful message for most
    operations.
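
The pattern this commit describes (one reflective instantiation, with a default implementation that fails with a helpful message) might look roughly like the sketch below. It is an assumption-laden illustration, not the real `AvroUtils`: the class names `AvroSupport` and `MissingAvroSupport`, the method `describeType`, and the implementation class name `org.example.avro.AvroSupportImpl` are all hypothetical.

```java
/** Hypothetical facade: all optional-dependency access goes through this one type. */
abstract class AvroSupport {

    /** Hypothetical name of the implementation shipped in the optional module. */
    static final String IMPL_CLASS = "org.example.avro.AvroSupportImpl";

    /** The reflective lookup runs once, when this class is initialized. */
    private static final AvroSupport INSTANCE = load(IMPL_CLASS);

    static AvroSupport get() {
        return INSTANCE;
    }

    static AvroSupport load(String className) {
        try {
            Class<?> clazz = Class.forName(className, false, AvroSupport.class.getClassLoader());
            return clazz.asSubclass(AvroSupport.class).getDeclaredConstructor().newInstance();
        } catch (ClassNotFoundException e) {
            // The optional module is not on the classpath: fall back to the default.
            return new MissingAvroSupport();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate " + className, e);
        }
    }

    /** Example operation; the real utility class offers different methods. */
    abstract String describeType(Class<?> type);
}

/** Default used when the optional module is absent: every operation explains what is missing. */
final class MissingAvroSupport extends AvroSupport {
    @Override
    String describeType(Class<?> type) {
        throw new UnsupportedOperationException(
                "Cannot handle type " + type.getName()
                        + " because the Avro module is not on the classpath."
                        + " Add it as a dependency to use Avro types.");
    }
}
```

Callers never touch reflection themselves; they call `AvroSupport.get()` and either get working behavior or an exception that names the missing dependency.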

commit ca4554b399b8b3a72b5381e0e29ed7e10cb95f83
Author: zentol <chesnay@apache.org>
Date:   2017-11-01T11:43:00Z

    [FLINK-7847] [avro] Fix typo in jackson shading pattern
    
    This closes #4931

commit 95f628d0c9f4e296d2ab176fb262a80b7e35a158
Author: Stephan Ewen <sewen@apache.org>
Date:   2017-11-02T14:44:04Z

    [hotfix] [avro] Minor XML formatting cleanup

commit c030e7b2fafa8a77f1bed59eea55f3ff821fd5a6
Author: Stephan Ewen <sewen@apache.org>
Date:   2017-11-02T15:18:21Z

    [FLINK-7420] [avro] Make flink-avro Scala independent
    
    This removes all dependencies on Scala-dependent projects.
    
    This commit introduces a hard wired test dependency to
    'flink-test-utils_2.11' to avoid introducing a Scala version dependency
    due to a non-exported test utility.

commit 22d7d14095646220659c426a7e497be463647dfe
Author: Stephan Ewen <sewen@apache.org>
Date:   2017-11-02T18:48:32Z

    [hotfix] [avro] Fix some serializability warnings and problems

commit da4424ba13c1946ccf7c88ebd659296a73944194
Author: Stephan Ewen <sewen@apache.org>
Date:   2017-11-02T18:51:06Z

    [hotfix] [avro] Simplify the FSDataInputStreamWrapper
    
    The FSDataInputStreamWrapper comes from a time when Flink's FsDataInputStream was not
    position aware. Now that it is, the FSDataInputStreamWrapper is not required to track
    its own position, but can simply delegate these calls to the FsDataInputStream.
    
    This also adds missing @Override tags.
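
The simplification described here, a wrapper that delegates position handling instead of tracking its own offset, could be sketched as below. All types are hypothetical stand-ins defined only for this example: `PositionAwareInputStream` plays the role of a position-aware input stream, `SeekableSource` plays the role of the seekable interface the reader expects, and `InMemoryPositionAwareStream` exists only so the sketch can be run.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical stand-in for a stream that knows and can change its own position. */
abstract class PositionAwareInputStream extends InputStream {
    abstract void seek(long position) throws IOException;
    abstract long getPos() throws IOException;
}

/** Hypothetical stand-in for the seekable interface a file reader expects. */
interface SeekableSource extends Closeable {
    void seek(long position) throws IOException;
    long tell() throws IOException;
    long length() throws IOException;
    int read(byte[] buffer, int offset, int length) throws IOException;
}

/** The wrapper keeps no position field; every position call is delegated. */
final class DelegatingSeekableSource implements SeekableSource {
    private final PositionAwareInputStream stream;
    private final long length;

    DelegatingSeekableSource(PositionAwareInputStream stream, long length) {
        this.stream = stream;
        this.length = length;
    }

    @Override
    public void seek(long position) throws IOException {
        stream.seek(position);
    }

    @Override
    public long tell() throws IOException {
        return stream.getPos();
    }

    @Override
    public long length() {
        return length;
    }

    @Override
    public int read(byte[] buffer, int offset, int length) throws IOException {
        return stream.read(buffer, offset, length);
    }

    @Override
    public void close() throws IOException {
        stream.close();
    }
}

/** Small in-memory stream used only to exercise the wrapper. */
final class InMemoryPositionAwareStream extends PositionAwareInputStream {
    private final byte[] data;
    private int pos;

    InMemoryPositionAwareStream(byte[] data) {
        this.data = data;
    }

    @Override
    public int read() {
        return pos < data.length ? (data[pos++] & 0xFF) : -1;
    }

    @Override
    void seek(long position) throws IOException {
        if (position < 0 || position > data.length) {
            throw new IOException("Invalid seek position: " + position);
        }
        pos = (int) position;
    }

    @Override
    long getPos() {
        return pos;
    }
}
```

With the position held in one place only, the wrapper and the underlying stream cannot drift apart after a seek or a partial read.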

commit 591596de340ad99d82b2a459831b0d5a1925b79e
Author: Stephan Ewen <sewen@apache.org>
Date:   2017-11-02T18:55:33Z

    [hotfix] [avro] Remove incorrect serializability from DataOutputEncoder

commit 66d5cabea8392750cfc2e024b5ea0e4d2800e4ac
Author: Stephan Ewen <sewen@apache.org>
Date:   2017-11-02T20:03:45Z

    [hotfix] [avro] Improve Avro type hierarchy checks in AvroKryoSerializerUtils

commit 33ff132a6e0144df1f98080a166d0f6e69c3dce0
Author: Stephan Ewen <sewen@apache.org>
Date:   2017-11-02T20:12:08Z

    [hotfix] [avro] Improve AvroUtils to perform reflection lookups only once.
    
    This also fixes minor warnings (unchecked casts) and moves the constants
    into scopes that avoid bridge methods for access outside of the nested classes.

commit aec29eb08862d9ecc9ab0f7beebf79fe05c2a7be
Author: Stephan Ewen <sewen@apache.org>
Date:   2017-11-02T21:38:54Z

    [hotfix] [avro] Add test that validates deserialization of Kryo Serializer in the absence
of Avro

----


> Move all Avro code to flink-avro
> --------------------------------
>
>                 Key: FLINK-7420
>                 URL: https://issues.apache.org/jira/browse/FLINK-7420
>             Project: Flink
>          Issue Type: Improvement
>          Components: Build System
>            Reporter: Stephan Ewen
>            Assignee: Aljoscha Krettek
>            Priority: Blocker
>             Fix For: 1.4.0
>
>
> *Problem*
> Currently, the {{flink-avro}} project is a shell with some tests and mostly duplicate
and dead code. The classes that use Avro are distributed quite wildly through the code base,
and introduce multiple direct dependencies on Avro in a messy way.
> That way, we cannot create a proper fat Avro dependency in which we shade Jackson away.
> Also, we expose Avro as a direct and hard dependency on many Flink modules, while it
should be a dependency that users that use Avro types selectively add.
> *Suggested Changes*
> We should move all Avro related classes to {{flink-avro}}, and give {{flink-avro}} a
dependency on {{flink-core}} and {{flink-streaming-java}}.
>   - {{AvroTypeInfo}}
>   - {{AvroSerializer}}
>   - {{AvroRowSerializationSchema}}
>   - {{AvroRowDeserializationSchema}}
> To be able to move the Avro serialization code from {{flink-core}} to {{flink-avro}},
we need to load the {{AvroTypeInformation}} reflectively, similar to how we load the {{WritableTypeInfo}}
for Hadoop.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
