flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From danielsuo <d...@cs.princeton.edu>
Subject Re: Parallelizing DataStream operations on Array elements
Date Fri, 04 Nov 2016 23:53:23 GMT
Till Rohrmann wrote
> I'm not sure whether I grasp the whole problem, but can't you split
> thevector up into the different rows, group by the row index and then
> applysome kind of continuous aggregation or window function?

So I could flatMap my incoming Arrays into (rowId, arrayElement) and gather
them appropriately in a window operation.Here is brief code to describe the
problem:

// Source emits Array[Double]
val input: DataStream[Array[Double]] = env.addSource(new MyArraySource())

// Collect windowSize Array[Double]
input.countWindowAll(windowSize, slideLength)



Now I have a windowSize (representing time) by arrayLength (representing
voxels) matrix. Flink lets me parallelize by time easily, but I'd like to
parallelize by voxel.



--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Parallelizing-DataStream-operations-on-Array-elements-tp9911p9916.html
Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.
Mime
View raw message