Ewen Cheslack-Postava created KAFKA-2481:
--------------------------------------------
Summary: Allow copycat sinks to request periodic invocation of put even if no
new data is available
Key: KAFKA-2481
URL: https://issues.apache.org/jira/browse/KAFKA-2481
Project: Kafka
Issue Type: Sub-task
Components: copycat
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava
Some connectors will need to perform actions periodically (or more generally, schedule actions
in the future). For example, in an HDFS connector, if you want to roll files every n minutes,
the sink connector needs to make sure it gets control every n minutes, regardless of whether data
is available.
However, if data isn't flowing into the consumer, we might never invoke {{put(records)}}.
Another variant of this is for connectors that might have an API like the new consumer's where
{{poll()}} needs to be invoked regularly.
In terms of design, I think there are at least two options:
1. This could be handled via the context, so asking to be scheduled for a {{put()}} is purely
opt-in, and the connector can specify the exact timeout.
2. Alternatively, the timeout could be returned by {{put()}}, since its return type is currently
void. We aren't using a return value right now, but this means every connector has to return one.
It is also unclear whether this will always be the only information a connector wants to return.
I think 1 is cleaner and doesn't require connector developers who don't care about the feature
to even know about it.
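A rough sketch of what option 1 could look like from a connector's point of view. Everything
here is hypothetical and only meant to illustrate the opt-in shape: {{SinkContext}},
{{requestPutWithin}}, the reduced {{SinkTask}} interface and {{RollingFileSink}} are made-up
names, not the actual copycat API, and the clock is injected only so the example can be run
without waiting.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.LongSupplier;

// Hypothetical context interface (assumption, not the real copycat API).
interface SinkContext {
    // Ask the framework to invoke put() again within timeoutMs,
    // even if no new records have arrived by then.
    void requestPutWithin(long timeoutMs);
}

// Hypothetical sink task shape, reduced to what this sketch needs.
interface SinkTask {
    void initialize(SinkContext context);
    void put(Collection<String> records);
}

// Illustrative sink that "rolls" its current file every rollIntervalMs.
class RollingFileSink implements SinkTask {
    private final long rollIntervalMs;
    private final LongSupplier clock;  // injected time source, in milliseconds
    private final List<String> buffer = new ArrayList<>();
    private SinkContext context;
    private long lastRollMs;
    private int rolledFiles = 0;

    RollingFileSink(long rollIntervalMs, LongSupplier clock) {
        this.rollIntervalMs = rollIntervalMs;
        this.clock = clock;
    }

    @Override
    public void initialize(SinkContext context) {
        this.context = context;
        this.lastRollMs = clock.getAsLong();
        // Opt in: a connector that never calls this keeps today's behavior.
        context.requestPutWithin(rollIntervalMs);
    }

    @Override
    public void put(Collection<String> records) {
        buffer.addAll(records);  // records may be empty on a timed invocation
        long now = clock.getAsLong();
        if (now - lastRollMs >= rollIntervalMs) {
            // Stand-in for closing the current file and opening a new one.
            rolledFiles++;
            buffer.clear();
            lastRollMs = now;
        }
        // Re-arm so put() is invoked no later than the next roll deadline.
        context.requestPutWithin(lastRollMs + rollIntervalMs - now);
    }

    int rolledFiles() {
        return rolledFiles;
    }
}
```

Under these assumptions, a connector that doesn't care about periodic invocation never touches
the context method, whereas with option 2 every {{put()}} implementation would have to return
something.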
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)