flink-dev mailing list archives

From Andra Lungu <lungu.an...@gmail.com>
Subject Caching information from a stream
Date Wed, 28 Oct 2015 15:30:05 GMT
Hey guys!

I've been thinking about this one today:

Say you have a stream of data in the form of (id, value), i.e. a
DataStream of Tuple2.
I need to cache this data in some sort of static stream (perhaps even a
DataSet).
Then, if an id that was previously stored shows up again in the input
stream, its cached value should be updated with the most recent entry.

For example, given the input:

1, 3
2, 5
6, 7
1, 5

The value cached for the id 1 should be 5.
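
To make the intended semantics concrete, here is a minimal sketch in plain
Java of the "keep the latest value per id" behaviour from the example above.
It does not use any Flink API; the class name LatestValueCache and its
methods are made up purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Flink: remembers the most recent
// value seen for each id.
class LatestValueCache {
    private final Map<Integer, Integer> latest = new HashMap<>();

    // Store the value for this id, overwriting any earlier entry.
    public void update(int id, int value) {
        latest.put(id, value);
    }

    // Returns the most recent value for the id, or null if the id
    // has never been seen.
    public Integer get(int id) {
        return latest.get(id);
    }

    public static void main(String[] args) {
        LatestValueCache cache = new LatestValueCache();
        // Same (id, value) records as in the example, in arrival order.
        int[][] input = { {1, 3}, {2, 5}, {6, 7}, {1, 5} };
        for (int[] record : input) {
            cache.update(record[0], record[1]);
        }
        System.out.println(cache.get(1)); // prints 5
    }
}
```

What I am after is the streaming equivalent of this map: something that
holds the latest value per id while the stream keeps flowing.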

How would you recommend caching the data? And what would be used for the
update? A join function?

As far as I can see, you cannot really combine DataSets with DataStreams,
even though a DataSet is, in essence, just a finite stream.
If this can indeed be done, some pseudocode would be nice :)

Thanks!
Andra
