flink-user mailing list archives

From Prabhu V <vpra...@gmail.com>
Subject Re: Flink Batch Processing with Kafka
Date Wed, 03 Aug 2016 13:03:54 GMT
If your environment is not kerberized (or if you can afford to restart the
job every 7 days), a checkpoint-enabled Flink job with windowing and a
count trigger would be ideal for your requirement.

Check the APIs on Flink windows.

I had something like this that worked, where stream is a DataStream built
from the Kafka connector, "function" is where you would have the "send it
to my microservice" part, and keyBy(0) is the aggregation based on a key
field.
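The snippet itself was lost from the archived message, so here is only a rough illustration of the count-window semantics being described: collect events into fixed-size batches and emit each full batch downstream. This is a plain-Java sketch, not the Flink API; the batch size of 2000, the class name, and the event type are illustrative (in Flink this corresponds roughly to chaining keyBy, a count window, and an apply function on the stream).

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of count-based micro-batching: buffer events and emit a batch
// every time the buffer reaches batchSize, mimicking a count window.
public class CountBatcher {
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();
    private final List<List<String>> emitted = new ArrayList<>();

    public CountBatcher(int batchSize) {
        this.batchSize = batchSize;
    }

    // Accept one event; when the buffer fills, emit it as one batch
    // (the "send it to my microservice" call would go here).
    public void onEvent(String event) {
        buffer.add(event);
        if (buffer.size() == batchSize) {
            emitted.add(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    public List<List<String>> batches() {
        return emitted;
    }

    public static void main(String[] args) {
        CountBatcher batcher = new CountBatcher(2000);
        for (int i = 0; i < 5000; i++) {
            batcher.onEvent("event-" + i);
        }
        // 5000 events at batch size 2000: two full batches emitted,
        // 1000 events still buffered until the next trigger fires.
        System.out.println(batcher.batches().size()); // prints 2
    }
}
```

Note that a real Flink count window also handles keying, state, and checkpointing for you; the sketch only shows the grouping logic.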

You could look up the individual methods in the API.


On Wed, Aug 3, 2016 at 5:21 AM, Alam, Zeeshan <Zeeshan.Alam@fmr.com> wrote:

> Hi,
> Flink works very well with Kafka if you wish to stream data. Following is
> how I am streaming data with Kafka and Flink:
> FlinkKafkaConsumer08<Event> kafkaConsumer = new FlinkKafkaConsumer08<>(
> KAFKA_AVRO_TOPIC, avroSchema, properties);
> DataStream<Event> messageStream = env.addSource(kafkaConsumer);
> Is there a way to do a micro batch operation on the data coming from
> Kafka? What I want to do is to reduce or aggregate the events coming
> from Kafka. For instance I am getting 40000 events per second from Kafka
> and what I want is to group 2000 events into one and send it to my
> microservice for further processing. Can I use the Flink DataSet API
> for this or should I go with Spark or some other framework?
> Thanks & Regards
> Zeeshan Alam
