flink-user mailing list archives

From Ufuk Celebi <...@apache.org>
Subject Re: flink batch data processing
Date Tue, 26 Jul 2016 11:44:45 GMT
Are you using the DataSet or DataStream API?

Yes, most Flink transformations operate on single tuples, but you can
work around it:
- You could write a custom source function, which emits records that
contain X points
(https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/index.html#data-sources)
- You can use a mapPartition
(https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/dataset_transformations.html#mappartition)
or FlatMap (https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/dataset_transformations.html#flatmap)
function and create the batches manually.
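
To make the second option concrete, here is a minimal sketch of the batching logic a partition-level function would run. It is plain Python rather than Flink API code, and every name in it (batches, fir_filter, filter_partition) is hypothetical. fir_filter is a pure-Python stand-in for the scipy filter call, so the sketch runs without scipy or Flink installed.

```python
from itertools import islice


def batches(records, size):
    """Yield lists of up to `size` consecutive records from an iterable."""
    it = iter(records)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def fir_filter(coefficients, data):
    """Stand-in for the scipy filter: y[n] = sum_k b[k] * x[n - k].

    Samples before the start of `data` are treated as zero.
    """
    out = []
    for n in range(len(data)):
        acc = 0.0
        for k, b in enumerate(coefficients):
            if n - k >= 0:
                acc += b * data[n - k]
        out.append(acc)
    return out


def filter_partition(records, coefficients, batch_size):
    """Body of a hypothetical mapPartition-style function.

    Receives an iterator over all records of one partition, groups them
    into batches, filters each batch as a whole, and emits the results
    one point at a time.
    """
    for batch in batches(records, batch_size):
        # Each batch is filtered independently, so the filter state
        # restarts from zero at every batch boundary.
        for y in fir_filter(coefficients, batch):
            yield y


if __name__ == "__main__":
    points = [float(i) for i in range(10)]
    print(list(filter_partition(points, [0.5, 0.5], batch_size=10)))
```

Because each batch is filtered on its own, the batch size should cover the whole signal (for example all 2000 points) if the filter must not restart in the middle, and the points then need to arrive in a single partition, in order.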

Does this help?

On Fri, Jul 22, 2016 at 7:21 PM, Paul Joireman <paul.joireman@physiq.com> wrote:
> I'm evaluating Flink for processing batches of data.  As a simple example, say
> I have 2000 points which I would like to pass through an FIR filter using
> functionality provided by the Python scipy library.  The scipy filter is a
> simple function which accepts a set of coefficients and the data to filter
> and returns the filtered data.  Is it possible to create a transformation to
> handle this in Flink?  It seems Flink transformations are applied on a
> point-by-point basis, but I may be missing something.
>
> Paul
