incubator-kafka-dev mailing list archives

From Taylor Gautier <tgaut...@tagged.com>
Subject Re: Kafka is live in prod @ 100%
Date Tue, 06 Dec 2011 17:34:00 GMT
We had to isolate topics to specific servers because we are running
several hundred thousand topics in aggregate.

Because Kafka keeps every topic's log directory under a single data
directory, it's not feasible to put that many topics on every host.

An improvement we considered making was to make the data directory
nested which would have alleviated this problem.  We also could have
tried a different filesystem but we weren't confident that would solve
the problem entirely.
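
To make the nested-directory idea concrete, here is a rough sketch in
Python. It is only an illustration and not something described in this
thread as built: the function name `nested_topic_dir`, the MD5-based
sharding, and the two-level, two-hex-character fan-out are all
assumptions. It only computes a path and does not touch Kafka or the
filesystem.

```python
import hashlib
from pathlib import PurePosixPath


def nested_topic_dir(data_root: str, topic: str, partition: int = 0,
                     levels: int = 2, width: int = 2) -> PurePosixPath:
    """Return a nested directory path for one topic partition.

    Hypothetical layout (an assumption, not Kafka's actual behavior):
        <data_root>/<h[0:2]>/<h[2:4]>/<topic>-<partition>
    where h is the hex MD5 digest of the topic name.
    """
    digest = hashlib.md5(topic.encode("utf-8")).hexdigest()
    # Slice the digest into `levels` shard names of `width` hex chars each.
    shards = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(data_root, *shards, f"{topic}-{partition}")


if __name__ == "__main__":
    # Illustrative topic name; real topic names are not given in this thread.
    print(nested_topic_dir("/data/kafka", "card-12345"))
```

With two levels of two hex characters there are 65,536 leaf directories,
so a few hundred thousand topics would put only a handful of entries in
each one instead of all of them in a single directory.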

The advantage of our solution is that each host in our Kafka tier is
literally shared-nothing. It will scale horizontally for a long, long
way.
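
One way this kind of topic-to-host isolation could work is a
deterministic hash of the topic name over a fixed host list. This
message does not say how topics are actually assigned, so the function
`host_for_topic`, the CRC32 hash, and the host names below are
assumptions for illustration only.

```python
import zlib
from typing import Sequence


def host_for_topic(topic: str, hosts: Sequence[str]) -> str:
    """Pick the single host that owns `topic` (hypothetical scheme).

    Every producer and consumer that uses the same host list computes
    the same answer, so the hosts never need to coordinate.
    """
    if not hosts:
        raise ValueError("hosts must not be empty")
    # CRC32 is stable across processes, unlike Python's built-in hash().
    index = zlib.crc32(topic.encode("utf-8")) % len(hosts)
    return hosts[index]


if __name__ == "__main__":
    # Hypothetical host names and topic name.
    kafka_hosts = ["kafka01", "kafka02", "kafka03"]
    print(host_for_topic("card-12345", kafka_hosts))
```

A plain modulo like this moves most topics to a different host whenever
the host list changes size, so a real deployment would likely want a
lookup table or consistent hashing instead.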

And it's also a contingency plan. Since Kafka was unproven (for us
anyway at the time) it was easier to build smaller components with
less overall functionality and glue them together in a scalable way.
If we had had to, we could have put a different message bus in place.
But we didn't want to do that if we could avoid it :)



On Dec 6, 2011, at 9:13 AM, Neha Narkhede <neha.narkhede@gmail.com> wrote:

> Taylor,
>
> This sounds great ! Congratulations on this launch.
>
>>> But basically we have many topics, few messages (relatively) per topic
>
> Can you explain your strategy of mapping topics to brokers ? The default in
> Kafka today is to have all brokers host all topics.
>
>>> An end user browser makes a long-poll event http connection to receive
>>> 1:1 messages and 1:M messages from a specialized http server we built for
>>> this purpose.  1:M messages are delivered from Kafka.
>
> What do you use for receiving 1:1 messages ?
>
> Your use case is interesting and different. It will be great if you add
> relevant details here -
> https://cwiki.apache.org/confluence/display/KAFKA/Powered+By
>
> Thanks,
> Neha
>
>
> On Tue, Dec 6, 2011 at 8:44 AM, Jun Rao <junrao@gmail.com> wrote:
>
>> Hi, Taylor,
>>
>> Thanks for the update. This is great. Could you update your usage in Kafka
>> wiki? Also, do you delete topics online? If so, how do you do that?
>>
>> Jun
>>
>> On Tue, Dec 6, 2011 at 8:30 AM, Taylor Gautier <tgautier@tagged.com>
>> wrote:
>>
>>> I've already mentioned this before, but I wanted to give a quick shout to
>>> let you guys know that our newest game, Deckadence, is 100% live as of
>>> yesterday.
>>>
>>> Check it out at http://www.tagged.com/deckadence.html
>>>
>>> A little about our use case:
>>>
>>>  - Deckadence is a game of buying and selling - or rather trading -
>>>  cards.  Every user on Tagged owns a card.  There are 100M users on
>>>  Tagged, so that means there are 100M cards to trade.
>>>  - Kafka enables real-time delivery of events in the game
>>>  - An end user browser makes a long-poll event http connection to
>>>  receive 1:1 messages and 1:M messages from a specialized http server
>>>  we built for this purpose.  1:M messages are delivered from Kafka.
>>>  - Because of this design, we can publish a message anywhere inside our
>>>  datacenter and send it directly and immediately to any other system
>>>  that is subscribed to Kafka, or to an end-user browser
>>>  - Every update event for every card is sent to a unique topic that
>>>  represents the user's card.
>>>  - When a user is browsing any card or list of cards - say a search
>>>  result - their browser subscribes to all of the cards on screen.
>>>  - The effect of this is that any changes to any card seen on-screen are
>>>  seen in real-time by all users of the game
>>>  - Our primary producers and consumers are PHP and NodeJS, respectively
>>>
>>> Well, I plan to write up more about this use case in the near future.  As
>>> you might have guessed, this is just about as far away from the original
>>> intent of Kafka as you could get - we have PHP that sends messages to
>>> Kafka.  Since it's not good to hold a TCP connection open in PHP, we had
>>> to do some trickery here.  There was no existing Node client so we had to
>>> write our own.  And since there are 100 million users registered on
>>> Tagged, that means we could have in theory 100M topics.  Of course in
>>> practice we have far fewer than that.  One of the main things we currently
>>> have to do is aggressively clean topics.  But basically we have many
>>> topics, few messages (relatively) per topic.  And order matters, so we had
>>> to deal with ensuring that we could handle the number of topics we would
>>> create, and ensure ordered delivery and receipt.
>>>
>>> I have big plans for Kafka in the future.  Another feature is currently
>>> in private test and will be released to the public soon (it uses Kafka in
>>> a more traditional way).  And we hope to have many more in 2012...
>>>
>>
