flink-issues mailing list archives

From Shiti <...@git.apache.org>
Subject [GitHub] flink pull request: [FLINK-2230] handling null values for TupleSer...
Date Sat, 27 Jun 2015 02:17:47 GMT
Github user Shiti commented on the pull request:

    @StephanEwen, apologies, I didn't notice the earlier message in JIRA. Something is wrong
with my Gmail settings, and most of the messages from JIRA and the mailing list went into Spam.
    Kindly excuse my limited understanding of this framework and of the intentions/drivers behind
the decisions made.
    Going through the mailing list and the ticket, I realized that although there may be some
valid cases of missing data, it is not desirable to change `TupleTypeInfo`
and the whole Tuple/Case Class serialization code base to support null, and we should identify
an alternative approach to handle this.
    From my limited understanding, the recommended way of working with missing values is to
use `(Option[Int], Option[Int])` instead of `(Int, Int)` when we know there can be missing
values in the data. Is that correct?
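For illustration, here is a minimal plain-Scala sketch of what I understand the `Option`-based approach to look like. It uses no Flink APIs; the `parse` helper and the sample rows are hypothetical and only stand in for real input data.

```scala
object OptionTuples {
  // Hypothetical helper: Some(value) for a parsable field, None when it is missing or malformed.
  def parse(field: String): Option[Int] =
    scala.util.Try(field.trim.toInt).toOption

  // Sample rows; an empty string stands for a missing value.
  val rows: Seq[(Option[Int], Option[Int])] =
    Seq(("1", "2"), ("", "5"), ("7", "")).map { case (a, b) => (parse(a), parse(b)) }

  // The application decides what "missing" means; here incomplete rows are skipped.
  val sums: Seq[Int] = rows.collect { case (Some(a), Some(b)) => a + b }

  def main(args: Array[String]): Unit = println(sums)
}
```

Every operation on such a tuple has to pattern match on, or map over, the `Option` fields, which is where the extra verbosity would come from.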
    If that is correct, I have a few questions:
    1. Doesn't this push the handling of missing data to the application code (which may be
good or bad) and make the application code more verbose?
    2. Wouldn't the size of Option[Int] in memory (and also in serialization) be more than
just Int?
    3. If Flink does not support null values except in the Table API, won't there be an
inconsistency when users try to convert a `Table` to a `DataSet[Tuple]`?
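To make question 2 concrete, here is a rough sketch that compares serialized sizes. It assumes a hand-rolled encoding in which an `Option[Int]` is written as a one-byte presence flag followed by the value when present. That encoding is an assumption for illustration and is not taken from Flink's actual serializers.

```scala
import java.io.{ByteArrayOutputStream, DataOutputStream}

object OptionSize {
  // Runs `write` against a fresh stream and returns the number of bytes written.
  def sizeOf(write: DataOutputStream => Unit): Int = {
    val bytes = new ByteArrayOutputStream()
    val out = new DataOutputStream(bytes)
    write(out)
    out.flush()
    bytes.size()
  }

  // A plain Int is written as 4 bytes.
  def writeInt(out: DataOutputStream, v: Int): Unit = out.writeInt(v)

  // Assumed encoding: presence flag first, payload only when the value is present.
  def writeOptInt(out: DataOutputStream, v: Option[Int]): Unit = v match {
    case Some(x) =>
      out.writeBoolean(true)
      out.writeInt(x)
    case None =>
      out.writeBoolean(false)
  }

  def main(args: Array[String]): Unit = {
    println(sizeOf(writeInt(_, 42)))          // plain Int
    println(sizeOf(writeOptInt(_, Some(42)))) // present value
    println(sizeOf(writeOptInt(_, None)))     // missing value
  }
}
```

Under this assumed encoding a present value costs one byte more than a plain `Int`, while a missing value costs a single byte. In memory the difference is larger, since `Some(42)` is a heap object wrapping a boxed integer on the JVM.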
    One alternative approach I can think of is introducing another TypeInfo which supports
null values (say TupleTypeInfoWithNull) so users can choose to use that when they know/think
that the data may contain null.
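As a rough sketch of how the serialization side of such a type could work, the code below writes a null bitmask ahead of the fields of a two-field tuple. It is plain Scala over `java.io` streams and does not implement Flink's `TypeInformation` or `TypeSerializer` interfaces; the object name, the `Row` alias and the mask layout are all assumptions for illustration.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}

object NullMaskSketch {
  // Hypothetical row type: either field may be null.
  type Row = (java.lang.Integer, java.lang.Integer)

  def write(row: Row, out: DataOutputStream): Unit = {
    val fields = Seq(row._1, row._2)
    // Bit i of the mask is set when field i is null.
    val mask = fields.zipWithIndex.foldLeft(0) {
      case (m, (f, i)) => if (f == null) m | (1 << i) else m
    }
    out.writeByte(mask)
    // Only non-null fields are written after the mask.
    fields.filter(_ != null).foreach(f => out.writeInt(f.intValue))
  }

  def read(in: DataInputStream): Row = {
    val mask = in.readByte()
    // A set bit means the field was null and no payload follows for it.
    def field(i: Int): java.lang.Integer =
      if ((mask & (1 << i)) != 0) null else java.lang.Integer.valueOf(in.readInt())
    val first = field(0)
    val second = field(1)
    (first, second)
  }

  // Serializes a row to bytes and reads it back.
  def roundTrip(row: Row): Row = {
    val bytes = new ByteArrayOutputStream()
    write(row, new DataOutputStream(bytes))
    read(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray)))
  }
}
```

With a layout like this the overhead is one mask byte per tuple rather than one flag per field, and the existing non-null tuple serialization would stay untouched for users who do not opt in.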

