edgent-dev mailing list archives

From Christofer Dutz <christofer.d...@c-ware.de>
Subject Re: Anyone else mis-interpret the "KafkaConsumer" and "KafkaProducer" all the time?
Date Thu, 22 Mar 2018 16:56:46 GMT
Hi Dale,

Happy to hear from you :-)

It was just something I had to explain every time I showed the code for what is currently by far
the most interesting use case for my plc4x PoCs (pumping data from a PLC to a Kafka
topic). So I thought that if I have to explain it every time because people are confused,
then we should probably talk about making things clearer.

Chris


________________________________
From: Dale LaBossiere <dml.apache@gmail.com>
Sent: Thursday, March 22, 2018 5:44:42 PM
To: dev@edgent.apache.org
Subject: Re: Anyone else mis-interpret the "KafkaConsumer" and "KafkaProducer" all the time?

A bit of background…

The Kafka connector is two classes instead of a single KafkaStreams connector (with publish()/subscribe())
because, at least a while ago (I don’t know if this is still the case), Kafka had two completely
separate classes for a “consumer” and a “producer”, each with very different config setup
params. By comparison, MQTT has a single MqttClient class (with publish()/subscribe()).

At the time, the decision was to name the Edgent Kafka classes similar to the underlying Kafka
API classes.  Hence KafkaConsumer (~wrapping Kafka’s ConsumerConnector) and KafkaProducer
(~wrapping Kafka’s KafkaProducer).  While not exposed today, it’s conceivable that some
day one could create an Edgent Kafka connector instance by providing a Kafka API class directly
instead of just a config map - e.g., supplying a Kafka KafkaProducer as an arg to the Edgent
KafkaProducer connector's constructor.  So having the names align seems like goodness.
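
To make the "provide a Kafka API class directly" idea concrete, here is a minimal, self-contained sketch. Everything in it is an assumption for illustration: ClientProducer and ConnectorProducer are hypothetical stand-ins, not the real Kafka or Edgent classes, and the second constructor is not part of the Edgent API today.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ConnectorNamingSketch {

    /** Hypothetical stand-in for the Kafka client's producer; NOT the real Kafka class. */
    public static class ClientProducer {
        private final Map<String, Object> config;
        private final List<String> sent = new ArrayList<>();

        public ClientProducer(Map<String, Object> config) {
            this.config = new HashMap<>(config);
        }

        /** Records the message instead of talking to a broker. */
        public void send(String topic, String value) {
            sent.add(topic + ":" + value);
        }

        public List<String> sent() {
            return sent;
        }
    }

    /** Hypothetical stand-in for the Edgent-side connector wrapping the client producer. */
    public static class ConnectorProducer {
        private final ClientProducer client;

        /** The path described as available today: build from a config map. */
        public ConnectorProducer(Map<String, Object> config) {
            this(new ClientProducer(config));
        }

        /** The conceivable future path: accept a ready-made client instance. */
        public ConnectorProducer(ClientProducer client) {
            this.client = client;
        }

        public void publish(String topic, String value) {
            client.send(topic, value);
        }

        public List<String> sent() {
            return client.sent();
        }
    }

    public static void main(String[] args) {
        Map<String, Object> config = new HashMap<>();
        config.put("bootstrap.servers", "localhost:9092");

        // Both construction paths end up wrapping a producer-side client object,
        // which is why aligned "Producer" names on both layers read naturally here.
        ConnectorProducer fromConfig = new ConnectorProducer(config);
        ConnectorProducer fromClient = new ConnectorProducer(new ClientProducer(config));

        fromConfig.publish("plc-data", "42");
        fromClient.publish("plc-data", "43");
        System.out.println(fromConfig.sent());
        System.out.println(fromClient.sent());
    }
}
```

With aligned names, the wrapper and the wrapped object both carry the word "Producer", so a constructor like the second one would not mix a "Sink" with a "Producer" in one signature.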

I don’t think the Edgent connectors should be trying to mask the underlying system’s API or
make it unnecessary for a user to understand it… just make it usable, and easily usable
for simple/common cases, in an Edgent topology context (worrying about when to actually make an
external connection, recovering from broken connections / reconnecting, handling common tuple
types).

As for the specific suggestions, I think simply switching the names of Edgent’s KafkaConsumer
and KafkaProducer is a bad idea :-)

Offering KafkaSource and KafkaSink is OK I guess (though probably retaining the current names
for a release or three).  Though I’ll note the Edgent API uses “source” and “sink”
as verbs, which take a Supplier and a Consumer fn as args respectively.  Note that Consumer is
used in the context of sink.
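
A small sketch of that source/sink-as-verbs shape, to show where the word "Consumer" lands. This is only an illustration under stated assumptions: MiniStream is a hypothetical stand-in rather than Edgent's stream type, and it uses java.util.function's Supplier and Consumer in place of Edgent's own function interfaces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

public class SourceSinkSketch {

    /** Hypothetical minimal stream; NOT Edgent's stream class. */
    public static class MiniStream<T> {
        private final Supplier<Iterable<T>> supplier;

        private MiniStream(Supplier<Iterable<T>> supplier) {
            this.supplier = supplier;
        }

        /** "source" as a verb: the Supplier feeds tuples INTO the pipeline. */
        public static <T> MiniStream<T> source(Supplier<Iterable<T>> supplier) {
            return new MiniStream<>(supplier);
        }

        /** "sink" as a verb: the Consumer takes tuples OUT of the pipeline. */
        public void sink(Consumer<T> consumer) {
            for (T tuple : supplier.get()) {
                consumer.accept(tuple);
            }
        }
    }

    /** Runs a tiny pipeline and returns what the sink's Consumer received. */
    public static List<String> run() {
        // Stand-in for an external system such as a Kafka topic.
        List<String> fakeTopic = new ArrayList<>();

        MiniStream<String> readings = MiniStream.source(() -> Arrays.asList("r1", "r2"));

        // The function handed to sink() is a Consumer: it consumes pipeline tuples
        // and hands them to the external system.
        readings.sink(fakeTopic::add);
        return fakeTopic;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Seen from inside the pipeline, the sink side is the one holding a Consumer, which is the opposite of how the Kafka-aligned connector names read.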

Alternatively there’s KafkaSubscriber and KafkaPublisher.  While clearer than Consumer/Producer,
I don’t know if they’re any better than Source/Sink.

In the end I guess I don’t feel strongly about it all… though I wonder if it’s really
worth the effort of changing.  At least the Edgent connector’s javadoc is pretty good /
clear for the classes and their use... I think :-)

— Dale


> On Mar 20, 2018, at 9:59 PM, vino yang <yanghua1127@gmail.com> wrote:
>
> Hi Chris,
>
> Every data processing framework can be thought of as a *pipeline*. From Edgent's
> point of view, there are two endpoints:
>
>
>   - source: means data injection;
>   - sink: means data export;
>
> Many frameworks use this naming convention, such as Apache
> Flume, Apache Flink, and Apache Spark (Structured Streaming).
>
> I think "KafkaConsumer" could be replaced with "KafkaSource" and
> "KafkaProducer" could be named "KafkaSink".
>
> The middle of the pipeline is the transformation of the data; there are
> many operators to transform data, such as map, flatmap, filter, reduce,
> and so on.
>
> Vino yang.
> Thanks.
>
> 2018-03-20 20:51 GMT+08:00 Christofer Dutz <christofer.dutz@c-ware.de>:
>
>> Hi,
>>
>> I have been using the Kafka integration quite often in the past, and there
>> is one thing I always have to explain when demonstrating code, which seems
>> to confuse everyone who sees the code:
>>
>> I would expect a KafkaConsumer to consume Edgent messages and publish them
>> to Kafka and would expect a KafkaProducer to produce Edgent events.
>>
>> Unfortunately it seems to be the other way around. This seems a little
>> unintuitive. Judging from the continued confusion when demonstrating code,
>> it may be worth considering renaming these (swapping their names), or
>> possibly even renaming them to “KafkaSource” (an Edgent source that consumes
>> Kafka messages and produces Edgent events) and “KafkaConsumer” (consumes
>> Edgent events and produces Kafka messages). After all, the classes are in
>> the Edgent namespace and come from the Edgent libs, so the fixed point when
>> inspecting these should be clear. Also, I bet no one would be confused if we
>> called something that produces Kafka messages a consumer, as there should
>> never be code that handles this from a Kafka point of view AND uses Edgent
>> at the same time.
>>
>> Chris
>>
>>
>>

