kafka-dev mailing list archives

From "Ewen Cheslack-Postava (JIRA)" <j...@apache.org>
Subject [jira] [Created] (KAFKA-2481) Allow copycat sinks to request periodic invocation of put even if no new data is available
Date Thu, 27 Aug 2015 17:10:45 GMT
Ewen Cheslack-Postava created KAFKA-2481:
--------------------------------------------

             Summary: Allow copycat sinks to request periodic invocation of put even if no new data is available
                 Key: KAFKA-2481
                 URL: https://issues.apache.org/jira/browse/KAFKA-2481
             Project: Kafka
          Issue Type: Sub-task
          Components: copycat
            Reporter: Ewen Cheslack-Postava
            Assignee: Ewen Cheslack-Postava


Some connectors will need to perform actions periodically (or more generally, schedule actions
in the future). For example, in an HDFS connector, if you want to roll files every n minutes,
the sink connector needs to make sure it gets control every n minutes, regardless of available
data. However, if data isn't flowing into the consumer, we might never invoke {{put(records)}}.
Another variant of this is a connector built on an API like the new consumer's, where
{{poll()}} needs to be invoked regularly.

In terms of design, I think there are at least two options:
1. This could be handled via the context, so it is purely opt-in: a task asks to be scheduled
for a put() and can specify exactly the timeout.
2. Alternatively, the timeout could be returned by put(), since the return type is currently
void. We aren't using a return value right now, but this does mean every connector has to
return something. Also, it is unclear that this will always be the only info you want to return.

I think 1 is cleaner and doesn't require connector developers who don't care about the feature
to even know about it.
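
To make option 1 concrete, here is a minimal sketch of how an opt-in request through the context might look. Everything in it is hypothetical and only illustrates the idea: the names SinkContext, requestPutWithin, Worker and RollingSink are assumptions, not the actual copycat API, and the worker loop is reduced to a single tick() driven by an explicit clock value so the behavior is easy to follow.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

class PeriodicPutSketch {
    /** Hypothetical context: a task opts in by asking for a put() within timeoutMs. */
    interface SinkContext {
        void requestPutWithin(long timeoutMs);
    }

    /** Hypothetical sink task; put() keeps its void return type. */
    interface SinkTask {
        void put(Collection<String> records);
    }

    /** Example task that wants control at least every rollIntervalMs. */
    static class RollingSink implements SinkTask {
        private final SinkContext context;
        private final long rollIntervalMs;
        int putCalls = 0;

        RollingSink(SinkContext context, long rollIntervalMs) {
            this.context = context;
            this.rollIntervalMs = rollIntervalMs;
            context.requestPutWithin(rollIntervalMs); // arm the first timeout
        }

        @Override
        public void put(Collection<String> records) {
            putCalls++;
            // Write records (possibly none) and roll files here if due.
            context.requestPutWithin(rollIntervalMs); // re-arm for the next interval
        }
    }

    /** Hypothetical framework side: calls put() on new data or on an expired request. */
    static class Worker implements SinkContext {
        private long requestedTimeoutMs = -1; // -1 means no request pending
        private long lastPutMs;
        private SinkTask task;

        void start(SinkTask task, long nowMs) {
            this.task = task;
            this.lastPutMs = nowMs;
        }

        @Override
        public void requestPutWithin(long timeoutMs) {
            this.requestedTimeoutMs = timeoutMs;
        }

        /** One iteration of the worker loop; polled may be empty. */
        void tick(List<String> polled, long nowMs) {
            boolean expired = requestedTimeoutMs >= 0
                    && nowMs - lastPutMs >= requestedTimeoutMs;
            if (!polled.isEmpty() || expired) {
                requestedTimeoutMs = -1; // one-shot; the task re-arms inside put()
                lastPutMs = nowMs;
                task.put(polled);
            }
        }
    }

    public static void main(String[] args) {
        Worker worker = new Worker();
        RollingSink sink = new RollingSink(worker, 1000L);
        worker.start(sink, 0L);
        worker.tick(Collections.<String>emptyList(), 500L);  // no data, timeout not reached
        worker.tick(Collections.<String>emptyList(), 1000L); // no data, timeout reached
        worker.tick(Arrays.asList("a"), 1200L);              // new data
        System.out.println("put() calls: " + sink.putCalls);
    }
}
```

A task that never calls requestPutWithin only sees put() when records arrive, which is the property that makes option 1 attractive: connector developers who don't need the feature are unaffected. Under option 2, by contrast, every put() implementation would have to return a value.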



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
