kafka-dev mailing list archives

From Marina Popova <ppine7...@protonmail.com>
Subject Re: Comparing Pulsar and Kafka: unified queuing and streaming
Date Tue, 05 Dec 2017 14:58:14 GMT
Hi,
I don't think it would be a good idea to start modifying the very foundation of Kafka's
design to accommodate more and more extra use cases.
Kafka became so widely adopted and popular because its creators made a brilliant decision
to make it a "dumb broker - smart consumer" type of system, where there are minimal to no
dependencies between Kafka brokers and Consumers. This is what makes Kafka blazingly fast and
truly scalable - able to handle thousands of Consumers with no impact on performance.

One unfortunate consequence of becoming so popular is that more and more people are trying
to fit Kafka into their architectures not because it really fits, but because everybody else
is doing so :) And this leads to many requests to add richer and richer functionality
to Kafka - like transactional messages, more complex acks, centralized consumer
management, etc.

If you really need those features, there are other systems that are designed for that.

I truly worry that if all those changes are added to Core Kafka, it will become just another
"do it all" enterprise-level monster that can do everything, but at the price of mediocre
performance and ten-fold increased complexity (and, thus, more management overhead and more room for bugs).
Sure, there has to be innovation and new features added - but maybe those that require major
changes to Kafka's core principles should go into separate frameworks, plug-ins (like
Connectors) or something along those lines, rather than packing it all into Core Kafka.

Just my 2 cents :)

Marina


> -------- Original Message --------
> Subject: Re: Comparing Pulsar and Kafka: unified queuing and streaming
> Local Time: December 4, 2017 2:56 PM
> UTC Time: December 4, 2017 7:56 PM
> From: jason@confluent.io
> To: dev@kafka.apache.org
> Kafka Users <users@kafka.apache.org>
>
> Hi Khurrum,
>
> Thanks for sharing the article. I think one interesting aspect of Pulsar
> that stands out to me is its notion of a subscription and how it impacts
> message retention. In Kafka, consumers are more loosely coupled and
> retention is enforced independently of consumption. There are some
> scenarios I can imagine where the tighter coupling might be beneficial. For
> example, in Kafka Streams, we often use intermediate topics to store the
> data in one stage of the topology's computation. These topics are
> exclusively owned by the application and once the messages have been
> successfully received by the next stage, we do not need to retain them
> further. But since consumption is independent of retention, we either have
> to choose a large retention time and deal with some temporary storage waste
> or we use a low retention time and possibly lose some messages during an
> outage.
>
> We have solved this problem to some extent in Kafka by introducing an API
> to delete the records in a partition up to a certain offset, but this
> effectively puts the burden of this use case on clients. It would be
> interesting to consider whether we could do something like Pulsar in the
> Kafka broker. For example, we have a consumer group coordinator which is
> able to track the progress of the group through its committed offsets. It
> might be possible to extend it to automatically delete records in a topic
> after offsets are committed if the topic is known to be exclusively owned
> by the consumer group. We already have the DeleteRecords API that we need, so
> maybe this is "just" a matter of some additional topic metadata. I'd be
> interested to hear whether this kind of use case is common among our users.
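
As a rough illustration of the idea in the quoted paragraph above, here is a toy, in-memory Python sketch of a coordinator that deletes records up to the committed offset when the committing group exclusively owns the topic. It does not use Kafka's client or broker code: `Partition`, `GroupCoordinator`, `delete_records_before` and the `exclusive_owner` field are hypothetical names made up for this example, and `exclusive_owner` stands in for the "additional topic metadata" mentioned above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Partition:
    """Toy in-memory stand-in for one topic partition (not Kafka's real API)."""
    log_start_offset: int = 0                     # first offset still retained
    records: list = field(default_factory=list)   # records[i] has offset log_start_offset + i

    @property
    def log_end_offset(self) -> int:
        return self.log_start_offset + len(self.records)

    def append(self, value) -> int:
        """Append a record and return the offset it was assigned."""
        offset = self.log_end_offset
        self.records.append(value)
        return offset

    def delete_records_before(self, offset: int) -> None:
        """Drop every record whose offset is below `offset` (clamped to the log bounds)."""
        offset = max(self.log_start_offset, min(offset, self.log_end_offset))
        del self.records[: offset - self.log_start_offset]
        self.log_start_offset = offset


@dataclass
class GroupCoordinator:
    """Hypothetical coordinator that truncates the log when the owning group commits."""
    partition: Partition
    exclusive_owner: Optional[str] = None   # assumed metadata: the only group that reads this topic
    committed: dict = field(default_factory=dict)

    def commit(self, group: str, offset: int) -> None:
        self.committed[group] = offset
        # Only the exclusive owner's progress is allowed to trigger deletion.
        if group == self.exclusive_owner:
            self.partition.delete_records_before(offset)


partition = Partition()
for i in range(5):
    partition.append(f"record-{i}")

coordinator = GroupCoordinator(partition, exclusive_owner="stage-2")
coordinator.commit("some-other-group", 4)   # not the owner: nothing is deleted
coordinator.commit("stage-2", 3)            # owner commits: offsets 0..2 are dropped
```

In this sketch retention follows consumption only for the owning group; every other group's commits leave the log untouched, which mirrors the "exclusively owned" condition in the proposal.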
>
> -Jason
>
> On Sun, Dec 3, 2017 at 10:29 PM, Khurrum Nasim khurrumnasimm@gmail.com
> wrote:
>
>> Dear Kafka Community,
>> I happened to read this blog post comparing the messaging model between
>> Apache Pulsar and Apache Kafka. It sounds interesting. Apache Pulsar claims
>> to unify streaming (Kafka) and queuing (RabbitMQ) in one unified API.
>> Pulsar also seems to support the Kafka API. Has anyone taken a look at Pulsar?
>> What does the community think about this? Pulsar is also an Apache project.
>> Is there any collaboration that can happen between these two projects?
>> https://streaml.io/blog/pulsar-streaming-queuing/
>> BTW, I am a Kafka user, loving Kafka a lot. Just trying to see what other
>> people think about this.
>>
>> - KN