flink-issues mailing list archives

From jkovacs <...@git.apache.org>
Subject [GitHub] flink pull request: [FLINK-2576] Add Outer Join operator to Optimi...
Date Fri, 18 Sep 2015 21:10:40 GMT
Github user jkovacs commented on the pull request:

    To partly answer my own question: one big drawback of downgrading the tuple field types
to `GenericTypeInfo` is that (de)serialization and comparison then go through the generic Kryo
serializers, which are significantly slower than Flink's native serializers and comparators
for basic types such as Integer (according to [this blog post](http://flink.apache.org/news/2015/05/11/Juggling-with-Bits-and-Bytes.html)).
    One obvious way to work around this is to only downgrade the fields that are actually
nullable, and keep the original types of the definitely non-null fields (i.e. the types from
the outer side of a left or right outer join). This way the user can still group/join/sort
efficiently on the non-null fields, while preserving null safety for the other fields.
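
    A minimal sketch of that selective downgrade, not the code from the branch: field types are modelled as plain strings, and `downgrade`, `OuterSide`, and `resultFieldTypes` are hypothetical names invented for illustration. In Flink itself the downgrade would correspond to wrapping the field's type in `GenericTypeInfo`; here it is only simulated so the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

public class OuterJoinProjectionTypes {

    /** The side of the join whose fields are guaranteed to be non-null. */
    public enum OuterSide { LEFT, RIGHT }

    // Hypothetical stand-in for downgrading a native type to a generic (Kryo-serialized) one.
    public static String downgrade(String nativeType) {
        return "Generic<" + nativeType + ">";
    }

    /**
     * Computes the field types of the projected join result:
     * fields from the outer side keep their native types,
     * fields from the other (possibly null) side are downgraded.
     */
    public static List<String> resultFieldTypes(
            List<String> leftTypes, List<String> rightTypes, OuterSide outer) {
        List<String> result = new ArrayList<>();
        for (String type : leftTypes) {
            // Left fields are non-null only in a left outer join.
            result.add(outer == OuterSide.LEFT ? type : downgrade(type));
        }
        for (String type : rightTypes) {
            // Right fields are non-null only in a right outer join.
            result.add(outer == OuterSide.RIGHT ? type : downgrade(type));
        }
        return result;
    }
}
```

    For a left outer join of `(Integer, String)` with `(Long)`, this keeps `Integer` and `String` as they are and downgrades only `Long`, so grouping, joining, or sorting on the first two fields would not have to fall back to the generic serializers.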
    I pushed another commit for this to my temporary branch for review, if this makes sense:
    As you can see, I was really hoping to make the projection joins work properly :-) but
if you feel that the effort isn't worth it, or that I'm missing something else entirely, we can
simply scrap that and throw an `InvalidProgramException` when the user tries to do
a projection outer join instead of defining their own join UDF. Opinions on that are welcome.
