kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bobby Calderwood (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-3455) Connect custom processors with the streams DSL
Date Sat, 13 May 2017 16:05:04 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009389#comment-16009389

Bobby Calderwood commented on KAFKA-3455:

The current Interface definitions for `Processor` and `Transformer` make it a bit difficult
to re-use one for the other.  Specifically, the `void init()` and `void close()` method signatures
are identical, but the `punctuate(long timestamp)` signature differs in a bizarre way: it
has a return type `R` the same as `R Transformer.transform(K key, V value)`, but the docs
specify that `null` must always be returned.

Wouldn't it make sense to DRY these up a bit by either a) changing the method signature of
`R Transformer.punctuate(long timestamp)` to match that of `Processor` (i.e. with a `void`
return type), and/or b) creating another interface encapsulating the lifecycle stuff (`init()`,
`close()` [or just use Java's AutoCloseable], and `punctuate(long timestamp)`) and make `Processor`
and `Transformer` single-method interfaces?  They could either inherit from the common lifecycle-ish
interface, or else compose together with it in implementing classes.

> Connect custom processors with the streams DSL
> ----------------------------------------------
>                 Key: KAFKA-3455
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3455
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: streams
>    Affects Versions:
>            Reporter: Jonathan Bender
>              Labels: user-experience
>             Fix For:
> From the kafka users email thread, we discussed the idea of connecting custom processors
with topologies defined from the Streams DSL (and being able to sink data from the processor).
 Possibly this could involve exposing the underlying processor's name in the streams DSL so
it can be connected with the standard processor API.
> {quote}
> Thanks for the feedback. This is definitely something we wanted to support
> in the Streams DSL.
> One tricky thing, though, is that some operations do not translate to a
> single processor, but a sub-graph of processors (think of a stream-stream
> join, which is translated to actually 5 processors for windowing / state
> queries / merging, each with a different internal name). So how to define
> the API to return the processor name needs some more thinking.
> {quote}

This message was sent by Atlassian JIRA

View raw message