flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Piotr Nowojski <pi...@data-artisans.com>
Subject Re: Flink flick cancel vs stop
Date Tue, 24 Oct 2017 11:46:36 GMT
I would propose implementations of NewSource to be not blocking/asynchronous. For example something
like

public abstract Future<T> getCurrent();

Which would allow us to perform some certain actions while there are no data available to
process (for example flush output buffers). Something like this came up recently when we were
discussing possible future changes in the network stack. It wouldn’t complicate API by a
lot, since default implementation could just:

public Future<T> getCurrent() {
  return completedFuture(getCurrentBlocking());
}

Another thing to consider is maybe we would like to leave the door open for fetching records
in some batches from the source’s input buffers? Source function (like Kafka) have some
internal buffers and it would be more efficient to read all/deserialise all data present in
the input buffer at once, instead of paying synchronisation/calling virtual method/etc costs
once per each record.

Piotrek

> On 22 Sep 2017, at 11:13, Aljoscha Krettek <aljoscha@apache.org> wrote:
> 
> @Eron Yes, that would be the difference in characterisation. I think technically all
sources could be transformed by that by pushing data into a (blocking) queue and having the
"getElement()" method pull from that.
> 
>> On 15. Sep 2017, at 20:17, Elias Levy <fearsome.lucidity@gmail.com <mailto:fearsome.lucidity@gmail.com>>
wrote:
>> 
>> On Fri, Sep 15, 2017 at 10:02 AM, Eron Wright <eronwright@gmail.com <mailto:eronwright@gmail.com>>
wrote:
>> Aljoscha, would it be correct to characterize your idea as a 'pull' source rather
than the current 'push'?  It would be interesting to look at the existing connectors to see
how hard it would be to reverse their orientation.   e.g. the source might require a buffer
pool.
>> 
>> The Kafka client works that way.  As does the QueueingConsumer used by the RabbitMQ
source.  The Kinesis and NiFi sources also seems to poll. Those are all the bundled sources.
> 


Mime
View raw message