flink-dev mailing list archives

From Maximilian Michels <...@apache.org>
Subject The null in Flink
Date Mon, 15 Jun 2015 15:45:26 GMT
Hi everyone,

I'm seeing a lot of pull requests related to null values lately, like these:

https://github.com/apache/flink/pull/780
https://github.com/apache/flink/pull/831
https://github.com/apache/flink/pull/834

It used to be the case that null values were simply not supported by Flink.
Recently, Flink has started to support null values in some components. Now
I'm wondering what the current state of null values in Flink is. While
ignoring null values might be good for not crashing your programs, null
values are generally a bad way of signaling empty values, for which better
strategies are available. My intuition would be that it is a bit evil to
support them in DataSets.

Just to give an idea of what null values could cause in Flink:
DataSet.count() returns the number of all elements in a DataSet (null or
not), while #834 would ignore null values and aggregate the DataSet without
them.
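
To make the mismatch concrete, here is a minimal sketch in plain Java. It
does not use the Flink API; the class and method names are made up for
illustration, and the null-skipping sum only assumes the behavior described
for #834 above. It shows how a mean computed from a count that includes
nulls and a sum that skips them differs from one computed over non-null
values only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical illustration only: plain Java collections, not Flink DataSets.
class NullCountSketch {

    // Counts every element, null or not (the behavior described for count()).
    static long countAll(List<Integer> values) {
        return values.size();
    }

    // Counts only the non-null elements.
    static long countNonNull(List<Integer> values) {
        return values.stream().filter(Objects::nonNull).count();
    }

    // Sums the elements while skipping nulls (assumed null-ignoring aggregation).
    static long sumIgnoringNulls(List<Integer> values) {
        return values.stream()
                .filter(Objects::nonNull)
                .mapToLong(Integer::longValue)
                .sum();
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(4, null, 8);

        long sum = sumIgnoringNulls(values);

        // Mixing the two conventions: the null is counted but contributes nothing.
        double mixedMean = (double) sum / countAll(values);

        // Consistent convention: the null is skipped in both sum and count.
        double consistentMean = (double) sum / countNonNull(values);

        System.out.println("count (all): " + countAll(values));
        System.out.println("sum (nulls ignored): " + sum);
        System.out.println("mean, mixed conventions: " + mixedMean);
        System.out.println("mean, nulls ignored throughout: " + consistentMean);
    }
}
```

With the input (4, null, 8), the count is 3 but the sum only covers two
elements, so the two means come out as 4.0 and 6.0 respectively.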

Best,
Max
