edgent-dev mailing list archives

From Dale LaBossiere <dml.apa...@gmail.com>
Subject Re: Anyone else mis-interpret the "KafkaConsumer" and "KafkaProducer" all the time?
Date Thu, 22 Mar 2018 16:44:42 GMT
A bit of background…

The Kafka connector is two classes instead of a single KafkaStreams connector (with publish()/subscribe())
because, at least a while ago (I don’t know if this is still the case), Kafka had two completely
separate classes for a “consumer” and a “producer”, each with very different config setup
params. By comparison, MQTT has a single MqttClient class (with publish()/subscribe()).

At the time, the decision was to name the Edgent Kafka classes similarly to the underlying Kafka
API classes.  Hence KafkaConsumer (~wrapping Kafka’s ConsumerConnector) and KafkaProducer
(~wrapping Kafka’s KafkaProducer).  While not exposed today, it’s conceivable that some
day one could create an Edgent Kafka connector instance by providing a Kafka API class directly
instead of just a config map - e.g., supplying a Kafka KafkaProducer as an arg to the Edgent
KafkaProducer connector's constructor.  So having the names align seems like goodness.

I don’t think the Edgent connectors should be trying to mask the underlying system’s API or to
make it unnecessary for a user to understand it… just make it usable, and easily usable
for simple/common cases, in an Edgent topology context (worrying about when to make an actual
external connection, recovering from broken connections / reconnecting, handling common tuple

As for the specific suggestions, I think simply switching the names of Edgent’s KafkaConsumer
and KafkaProducer is a bad idea :-)

Offering KafkaSource and KafkaSink is OK I guess (though probably retaining the current names
for a release or three).  Though I’ll note the Edgent API uses “source” and “sink”
as verbs, which take a Supplier and a Consumer fn as args respectively.  Note that Consumer is
used in the context of sink.
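
To illustrate the naming overlap, here is a minimal toy sketch in plain Java. It is not the Edgent API: ToyStream, source(), sink() and run() are hypothetical stand-ins that only mimic the shape described above, where "source" takes a Supplier fn and "sink" takes a Consumer fn.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Toy model only: NOT the Edgent API. All names here are hypothetical.
public class SourceSinkNaming {

    // Hypothetical stand-in for a stream of tuples.
    static final class ToyStream<T> {
        private final Iterable<T> tuples;

        ToyStream(Iterable<T> tuples) {
            this.tuples = tuples;
        }

        // "sink" as a verb: hand every tuple to a Consumer function.
        void sink(Consumer<T> consumerFn) {
            for (T t : tuples) {
                consumerFn.accept(t);
            }
        }
    }

    // "source" as a verb: pull the tuples from a Supplier function.
    static <T> ToyStream<T> source(Supplier<Iterable<T>> supplierFn) {
        return new ToyStream<>(supplierFn.get());
    }

    // Runs a tiny pipeline and returns what the sink received.
    static List<String> run() {
        List<String> exported = new ArrayList<>();
        // Inbound side: data enters the pipeline through a Supplier.
        ToyStream<String> readings = source(() -> Arrays.asList("r1", "r2"));
        // Outbound side: a java.util.function.Consumer receives each tuple,
        // even though this is the end where a Kafka *producer* would publish.
        readings.sink(exported::add);
        return exported;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [r1, r2]
    }
}
```

In this sketch the word "Consumer" shows up on the outbound (sink) end, which is the same end where the Edgent KafkaProducer sits, so a rename to Source/Sink would not remove every possible mix-up.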

Alternatively there’s KafkaSubscriber and KafkaPublisher.  While clearer than Consumer/Producer,
I don’t know if they’re any better than Source/Sink.

In the end I guess I don’t feel strongly about it all… though I wonder whether it’s really
worth the effort of changing.  At least the Edgent connector’s javadoc is pretty good /
clear for the classes and their use... I think :-)

— Dale

> On Mar 20, 2018, at 9:59 PM, vino yang <yanghua1127@gmail.com> wrote:
> Hi Chris,
> All data processing frameworks can be thought of as a *pipeline*. From Edgent's
> point of view, there could be two endpoints:
>   - source : means data injection;
>   - sink : means data export;
> Many frameworks use this naming convention, such as Apache
> Flume, Apache Flink, and Apache Spark (Structured Streaming).
> I think "KafkaConsumer" could be replaced with "KafkaSource" and
> "KafkaProducer" could be named "KafkaSink".
> The middle of the pipeline is the transformation of the data; there are
> many operators to transform data, such as map, flatmap, filter, reduce,
> and so on.
> Vino yang.
> Thanks.
> 2018-03-20 20:51 GMT+08:00 Christofer Dutz <christofer.dutz@c-ware.de>:
>> Hi,
>> I have been using the Kafka integration quite often in the past and one
>> thing I always have to explain when demonstrating code and which seems to
>> confuse everyone seeing the code:
>> I would expect a KafkaConsumer to consume Edgent messages and publish them
>> to Kafka and would expect a KafkaProducer to produce Edgent events.
>> Unfortunately it seems to be the other way around. This seems a little
>> unintuitive. Judging from the continued confusion when demonstrating code,
>> it may be worth considering renaming these (swapping their names).
>> Possibly even rename them to “KafkaSource” (Edgent Source that consumes
>> Kafka messages and produces Edgent events) and “KafkaConsumer” (Consumes
>> Edgent Events and produces Kafka messages). After all the Classes are in
>> the Edgent namespace and come from the Edgent libs, so the fixed point when
>> inspecting these should be clear. Also I bet no one would be confused if we
>> called something that produces Kafka messages a consumer as there should
>> never be code that handles this from a Kafka point of view AND uses Edgent
>> at the same time.
>> Chris
