flink-user mailing list archives

From "Andres R. Masegosa" <and...@cs.aau.dk>
Subject How to create a stream of data batches
Date Fri, 04 Sep 2015 11:00:59 GMT
Hi,

I'm trying to implement some machine learning algorithms on top of Flink,
such as variational Bayes learning algorithms. Instead of working at the
level of individual data elements (i.e. using map transformations), it
would be far more efficient to work at the level of a "batch of elements"
(i.e. I receive a batch of elements and produce some output).

I could implement that with the "mapPartition" function, but I cannot
control the size of the partition, can I?

Is there any way to transform a stream (or DataSet) of elements into a
stream (or DataSet) of data batches that all have the same size?
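
For illustration, the grouping being asked about can be sketched in plain
Java with no Flink dependency. The class name Batcher and the method
batches are hypothetical helpers invented for this sketch and are not part
of any Flink API. It only shows the intended semantics: consecutive
elements are grouped into lists of a fixed size, and the final list may be
shorter when the input does not divide evenly.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper for illustration only; not part of Flink.
final class Batcher {

    // Groups consecutive elements into lists of at most batchSize elements.
    // Every batch except possibly the last has exactly batchSize elements.
    static <T> List<List<T>> batches(Iterator<T> elements, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        List<T> current = new ArrayList<>(batchSize);
        while (elements.hasNext()) {
            current.add(elements.next());
            if (current.size() == batchSize) {
                // Batch is full: store it and start a new one.
                result.add(current);
                current = new ArrayList<>(batchSize);
            }
        }
        // Keep the trailing, possibly smaller, batch.
        if (!current.isEmpty()) {
            result.add(current);
        }
        return result;
    }
}
```

The same loop could presumably run inside a "mapPartition" function, which
receives the partition's elements as an Iterable, emitting each full batch
as it is completed instead of collecting them in a list. In that case each
partition could still end with one smaller batch, so the sizes would only
be equal up to that remainder.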


Thanks for your support,
Andres
