kafka-users mailing list archives

From "Bruno D. Rodrigues" <bruno.rodrig...@litux.org>
Subject Re: Is there a way to pull out kafka metadata from zookeeper?
Date Sat, 12 Oct 2013 12:17:53 GMT
What I understood from Neha is that querying the metadata from one (any) Kafka broker returns
everything in one go. Querying the data indirectly via zookeeper would be more complicated
and would involve more requests between the zookeeper and the brokers before being able to
answer back.
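
To put rough numbers on that, here is a small sketch that counts the reads needed to rebuild the same metadata from zookeeper. It assumes the Kafka 0.8 znode layout listed in the docstring, which is my assumption and not something stated in this thread, and `zookeeper_reads` is a hypothetical helper name. The cluster size is the one from Siyuan's example (20 brokers, 10 topics, 2 partitions each).

```python
def zookeeper_reads(num_topics, partitions_per_topic, num_brokers):
    """Count the ZooKeeper reads needed to rebuild full cluster metadata.

    Assumed znode layout (Kafka 0.8 style, an assumption for illustration):
      /brokers/topics                               one getChildren
      /brokers/topics/<topic>                       one getData per topic
      /brokers/topics/<topic>/partitions/<p>/state  one getData per partition
      /brokers/ids                                  one getChildren
      /brokers/ids/<id>                             one getData per broker
    """
    list_topics = 1
    topic_assignments = num_topics
    partition_states = num_topics * partitions_per_topic
    list_brokers = 1
    broker_registrations = num_brokers
    return (list_topics + topic_assignments + partition_states
            + list_brokers + broker_registrations)


# A single TopicMetadataRequest with an empty topic list covers all of the above.
BROKER_METADATA_REQUESTS = 1

if __name__ == "__main__":
    reads = zookeeper_reads(num_topics=10, partitions_per_topic=2, num_brokers=20)
    print(f"zookeeper reads: {reads}, broker requests: {BROKER_METADATA_REQUESTS}")
```

Under those assumptions the zookeeper route costs dozens of reads where the broker route costs one request, which is the trade-off Neha described.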

What I suggested was *not* to get the metadata from the zookeeper, but simply the broker list,
to avoid having a list of brokers in the configuration. Or more concretely, IMHO the producers
and consumers should be consistent and allow setting either a list of zookeepers or a list
of brokers.

In both cases, independently of whatever action they need to do, they could ask a random broker
for the information, or ask a random zookeeper for the list of brokers, and then a random

For me this would make it consistent and more professional. It would work with or without
zookeeper, and it would let developers decide for themselves whether to point to the brokers
or to the zookeepers. But more importantly, if zookeeper is there to help the coordination
of the brokers, the producers and consumers should rely on it to discover what they need
to discover - the list of brokers.
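
A minimal sketch of the two-step discovery I have in mind, with the network parts left out. It assumes each broker registers a JSON blob containing `host` and `port` under its id in zookeeper (my assumption about the registration format). `parse_broker_registrations`, `fetch_metadata_from_any` and the `fetch_metadata` callback are hypothetical names for illustration, not part of any Kafka client API.

```python
import json
import random


def parse_broker_registrations(registrations):
    """Turn {broker_id: json_string} into a list of (host, port) tuples.

    `registrations` stands in for what a zookeeper client would return
    when reading each broker's registration znode (assumed JSON format).
    """
    brokers = []
    for broker_id in sorted(registrations):
        info = json.loads(registrations[broker_id])
        brokers.append((info["host"], int(info["port"])))
    return brokers


def fetch_metadata_from_any(brokers, fetch_metadata, rng=random):
    """Ask brokers in random order and return the first successful answer.

    `fetch_metadata(host, port)` is a caller-supplied function that sends
    the metadata request and raises OSError if the broker is unreachable.
    Each broker is tried at most once before giving up.
    """
    candidates = list(brokers)
    rng.shuffle(candidates)
    last_error = None
    for host, port in candidates:
        try:
            return fetch_metadata(host, port)
        except OSError as err:
            last_error = err  # broker is down, move on to the next one
    raise ConnectionError("no broker answered the metadata request") from last_error
```

With something like this, the client configuration only needs the zookeeper address. The broker list is read at startup and a dead broker costs one extra attempt.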

This way, one side pointing one way and the other side pointing a different way got me quite

On 12/10/2013, at 01:16, "hsy541@gmail.com" <hsy541@gmail.com> wrote:

> That's why I'm asking, I would like to see a kafka zookeeper client api to
> get TopicMetadata instead of my own hacky way to query the zookeeper
> Thanks!
> Best,
> Siyuan
> On Fri, Oct 11, 2013 at 4:00 PM, Bruno D. Rodrigues <
> bruno.rodrigues@litux.org> wrote:
>> Why not ask zookeeper for the list of brokers and then ask a random
>> broker for the metadata (and repeat if the broker is down), even if
>> it's two calls.
>> Heck it already does unnecessary connections. It connects to a broker,
>> gets the metadata, disconnects, and then connects again for the data.
>> If it's already assumed a producer or consumer will take some seconds
>> until ready, what is another call gonna prejudice the flow.
>> Then producers and consumers would then be consistently configured. Or
>> allow the producers to also go to a broker instead of zookeeper.
>> This way the consumer needs to know and hardcode at least one node.
>> The node can fail. It can be changed.
>> I thought zookeeper served to abstract this kind of complexity
>> --
>> Bruno Rodrigues
>> Sent from my iPhone
>> On 11/10/2013, at 22:40, Neha Narkhede <neha.narkhede@gmail.com> wrote:
>>>> For each consumer consumes different
>>>> topic/replica I have to specify those 20 brokers and go over all of them to
>>>> know which broker is alive. And even worse how about I dynamically add new
>>>> broker into the cluster and remove the old one
>>> TopicMetadataRequest is a batch API and you can get metadata information
>>> for either a list of all topics or all topics in the cluster, if you
>>> specify an empty list of topics. Adding a broker is not a problem since the
>>> metadata request also returns the list of brokers in a cluster. The reason
>>> this is better than reading from zookeeper is because the same operation
>>> would require multiple zookeeper roundtrips, instead of a single
>>> TopicMetadataRequest roundtrip to some kafka broker.
>>> Thanks,
>>> Neha
>>>> On Fri, Oct 11, 2013 at 11:30 AM, hsy541@gmail.com <hsy541@gmail.com> wrote:
>>>> Thanks guys!
>>>> But I feel weird. Assume I have 20 brokers for 10 different topics with 2
>>>> partitions and 2 replicas for each. For each consumer consumes different
>>>> topic/replica I have to specify those 20 brokers and go over all of them to
>>>> know which broker is alive. And even worse how about I dynamically add new
>>>> broker into the cluster and remove the old one. I think it's nice to have a
>>>> way to get metadata from zookeeper(centralized coordinator?) directly.
>>>> Best,
>>>> Siyuan
>>>> On Fri, Oct 11, 2013 at 9:12 AM, Neha Narkhede <neha.narkhede@gmail.com> wrote:
>>>>> If, for some reason, you don't have access to a virtual IP or load
>>>>> balancer, you need to round robin once through all the brokers before
>>>>> failing a TopicMetadataRequest. So unless all the brokers in your cluster
>>>>> are down, this should not be a problem.
>>>>> Thanks,
>>>>> Neha
>>>>> On Thu, Oct 10, 2013 at 10:50 PM, hsy541@gmail.com <hsy541@gmail.com> wrote:
>>>>>> Hi guys,
>>>>>> I'm trying to maintain a bunch of simple kafka consumer to consume messages
>>>>>> from brokers. I know there is a way to send TopicMetadataRequest broker
>>>>>> and get the response from the broker. But you have to specify the broker
>>>>>> list to query the information. But broker might not be available because of
>>>>>> some failure. My question is is there any api I can call and query broker
>>>>>> metadata for topic/partition directly from zookeeper? I know I can query
>>>>>> that information using zookeeper API. But that's not friendly datastructure
>>>>>> like the TopicMetadata/PartitionMetadata. Thank you!
>>>>>> Best,
>>>>>> Siyuan
