kafka-dev mailing list archives

From David Arthur <mum...@gmail.com>
Subject Re: Kafka REST interface
Date Mon, 10 Sep 2012 13:49:09 GMT

Anyone have feedback on this approach?


On Aug 24, 2012, at 12:37 PM, David Arthur wrote:

> Here is an initial pass at a Kafka REST proxy (in Scala)
> https://github.com/mumrah/kafka/blob/rest/contrib/rest-proxy/src/main/scala/RESTServer.scala
> The basic gist is:
> * Jetty for webserver
> * Messages are strings
> * GET /topic/group to get a message (timeout after 1s)
> * POST /topic, the request body is the message
> * One consumer thread per topic+group
> Be wary, many things are hard-coded at this point (port numbers, etc.). Obviously, this
> will need to change. Also, I haven't the slightest idea how to set up/use sbt properly, so
> I just checked in the libs.
> Feedback is welcome in this thread or on GitHub. Be gentle please, this is my first
> go at Scala.
> -David
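For readers who want to see the shape of the two routes listed above without reading the Scala source, here is a minimal sketch in Python. It is not taken from RESTServer.scala. `InMemoryLog` and `handle` are hypothetical names, an in-memory list stands in for Kafka, and a per-group offset stands in for the per-topic+group consumer thread. The 204 and 404 status codes and the choice to start a new group at the beginning of the topic are assumptions, since the message does not say what the proxy does in those cases.

```python
import threading


class InMemoryLog:
    """Hypothetical stand-in for Kafka: one append-only list per topic."""

    def __init__(self):
        self._topics = {}    # topic -> list of message strings
        self._offsets = {}   # (topic, group) -> index of next message to read
        self._cond = threading.Condition()

    def produce(self, topic, message):
        with self._cond:
            self._topics.setdefault(topic, []).append(message)
            self._cond.notify_all()

    def consume(self, topic, group, timeout=1.0):
        """Return the next message for this group, or None if the wait times out."""
        key = (topic, group)

        def ready():
            return self._offsets.get(key, 0) < len(self._topics.get(topic, []))

        with self._cond:
            if not self._cond.wait_for(ready, timeout=timeout):
                return None
            offset = self._offsets.get(key, 0)
            self._offsets[key] = offset + 1
            return self._topics[topic][offset]


def handle(log, method, path, body="", timeout=1.0):
    """Map POST /topic and GET /topic/group onto the log; returns (status, body)."""
    parts = [p for p in path.split("/") if p]
    if method == "POST" and len(parts) == 1:
        log.produce(parts[0], body)          # the request body is the message
        return 200, ""
    if method == "GET" and len(parts) == 2:
        message = log.consume(parts[0], parts[1], timeout)
        if message is None:
            return 204, ""                   # assumed status when nothing arrives in time
        return 200, message
    return 404, ""                           # assumed status for any other route
```

Each group keeps its own position, so two groups reading the same topic each see every message, while two reads within one group see consecutive messages.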
> On Aug 12, 2012, at 10:39 AM, Taylor Gautier wrote:
>> Jay I agree with you 100%.
>> At Tagged we have implemented a proxy for various internal reasons
>> (primarily to act as a high-performance relay from PHP to Kafka). It's
>> implemented in Node.js (JavaScript).
>> Currently it services UDP packets encoded in binary, but it could
>> easily be modified to accept HTTP as well, since Node support for HTTP is
>> pretty simple.
>> If others are interested in maintaining something like this, we could
>> consider adding this to the public domain alongside the already
>> existing Node.js client implementation.
>> On Aug 10, 2012, at 3:51 PM, Jay Kreps <jay.kreps@gmail.com> wrote:
>>> My personal preference would be to have only a single protocol in kafka
>>> core. I have been down the multiple protocol route and my experience was
>>> that it adds a lot of burden for each change that needs to be made and a
>>> lot of complexity to abstract over the different protocols. From the point
>>> of view of a user they are generally a bit agnostic as to how bytes are
>>> sent back and forth provided it is reliable and easily implementable in any
>>> language. Generally they care more about the quality of the client in their
>>> language of choice.
>>> My belief is that the main benefit of REST is ease of implementing a
>>> client. But currently the biggest barrier is really the use of zk and
>>> fairly thick consumer design. So I think the current thinking is that we
>>> should focus on thinning that out and removing the client-side zk
>>> dependency. I actually don't think TCP is a huge burden if the protocol is
>>> simple, and there are actually some advantages (for example the consumer
>>> needs to consume from multiple servers so select/poll/epoll is natural but
>>> this is not always available from HTTP client libraries).
>>> Basically this is an area where I think it is best to pick one way and
>>> make it really bulletproof rather than providing lots of options.
>>> In some sense each option tends to increase the complexity of testing
>>> (since now there are many combinations to try) and also of implementation
>>> (since now a lot of things that were concrete need to be abstracted away).
>>> So from this perspective I would prefer a standalone proxy that could
>>> evolve independently rather than retro-fitting the current socket server to
>>> handle other protocols. There will be some overhead for the extra hop, but
>>> then there is some overhead for HTTP itself.
>>> This is just my personal opinion; it would be great to hear what others
>>> think.
>>> -Jay
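To illustrate the point above about select/poll/epoll being a natural fit for a consumer that reads from several servers, here is a small sketch using Python's standard `selectors` module. It does not speak the Kafka protocol. `open_fake_brokers` and `drain_ready` are hypothetical helpers, and `socket.socketpair()` connections stand in for broker TCP connections.

```python
import selectors
import socket


def open_fake_brokers(labels):
    """Create one local socket pair per label and register the read side.

    Returns (selector, writers), where writers maps label -> the socket a
    pretend broker would write to.
    """
    sel = selectors.DefaultSelector()
    writers = {}
    for label in labels:
        reader, writer = socket.socketpair()
        reader.setblocking(False)
        sel.register(reader, selectors.EVENT_READ, data=label)
        writers[label] = writer
    return sel, writers


def drain_ready(sel, timeout=1.0):
    """Wait once, then read from every connection that has data available."""
    results = []
    for key, _events in sel.select(timeout):
        data = key.fileobj.recv(4096)
        results.append((key.data, data))
    return results
```

A single thread can wait on all connections at once and only touch the ones that have data, which is the pattern a plain request/response HTTP client library may not expose.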
>>> On Mon, Aug 6, 2012 at 5:39 AM, David Arthur <mumrah@gmail.com> wrote:
>>>> I'd be happy to collaborate on this, though it's been a while since I've
>>>> used PHP.
>>>> From what it looks like, what you have is a true proxy that runs outside
>>>> of Kafka and translates some REST routes into Kafka client calls. This
>>>> sounds more in line with what the project page describes. What I have
>>>> proposed is more like a translation layer between some REST routes and
>>>> FetchRequests. In this case the client is responsible for managing offsets.
>>>> Using the consumer groups and ZooKeeper would be another nice way of
>>>> consuming messages (which is probably more like what you have).
>>>> Any maintainers have feedback on this?
>>>> On Aug 3, 2012, at 4:13 PM, Jonathan Creasy wrote:
>>>>> I have an internal one working and was hoping to have it open sourced in
>>>>> the next week. The one at Box is based on the CodeIgniter framework; we
>>>>> have about 45 RESTful interfaces built on this framework, so I just put
>>>>> together another one for Kafka.
>>>>> Here are my notes. These were pre-dev, so they may be a little different from
>>>>> what we ended up with.
>>>>> https://cwiki.apache.org/confluence/display/KAFKA/Restful+API+Proposal
>>>>> I will read yours later this afternoon, we should work together.
>>>>> -Jonathan
>>>>> On Fri, Aug 3, 2012 at 7:41 AM, David Arthur <mumrah@gmail.com> wrote:
>>>>>> I'd like to tackle this project (assuming it hasn't been started
>>>>>> I wrote up some initial thoughts here: https://gist.github.com/3248179
>>>>>> TL;DR: use Range header for specifying offsets, simple URIs like
>>>>>> /kafka/topics/[topic]/[partition], use for a simple transport of
>>>>>> and/or represent the messages as some media type (text, json, xml)
>>>>>> Feedback is most welcome (in the Gist or in this thread).
>>>>>> Cheers!
>>>>>> -David
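As a rough illustration of the proposal above (offsets carried in a Range header, partitions addressed as /kafka/topics/[topic]/[partition]), here is a sketch of how a proxy might parse such a request. The gist's exact header syntax is not quoted in this thread, so the `offset=START-END` form is an assumption modelled on the general `unit=first-last` shape of HTTP ranges, and `parse_fetch` is a hypothetical helper.

```python
import re

# Path layout from the proposal: /kafka/topics/[topic]/[partition]
_PATH = re.compile(r"^/kafka/topics/(?P<topic>[^/]+)/(?P<partition>\d+)$")

# Assumed header value, e.g. "offset=100-199" or the open-ended "offset=100-"
_RANGE = re.compile(r"^(?P<unit>[A-Za-z]+)=(?P<start>\d+)-(?P<end>\d*)$")


def parse_fetch(path, range_header):
    """Turn a request path and Range header value into fetch parameters."""
    path_match = _PATH.match(path)
    if path_match is None:
        raise ValueError("unrecognised path: %r" % path)

    range_match = _RANGE.match(range_header.strip())
    if range_match is None:
        raise ValueError("unrecognised Range value: %r" % range_header)

    start = int(range_match["start"])
    end = int(range_match["end"]) if range_match["end"] else None
    if end is not None and end < start:
        raise ValueError("range end precedes start")

    return {
        "topic": path_match["topic"],
        "partition": int(path_match["partition"]),
        "unit": range_match["unit"],
        "start": start,
        "end": end,   # None means "from start onwards"
    }
```

With this layout the client, not the proxy, tracks where it is in the partition: it sends the offset it wants on every request.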
