kafka-dev mailing list archives

From "Joe Stein (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-240) implement new producer and consumer request format
Date Wed, 18 Jan 2012 14:48:40 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188479#comment-13188479 ]

Joe Stein commented on KAFKA-240:
---------------------------------

So, one possibility I am looking at for the data [<topic_data_struct>] is to use ByteArrayOutputStream
and ObjectOutputStream to serialize the List[WiredTopic] into the byte buffer.  My concern
here, though, is that non-JVM clients would not be able to support it.
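
For reference, a minimal sketch of what the ObjectOutputStream route could look like in Java.
The WiredTopic class here is a hypothetical stand-in (its real fields are not shown in this
comment), and toBuffer is just an illustrative helper name.  The resulting bytes start with the
Java serialization stream header (0xACED 0x0005), which is the kind of JVM-specific framing a
non-JVM client would have to reimplement.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

class JavaSerializationSketch {

    // Hypothetical stand-in for WiredTopic; the fields are assumptions for illustration.
    static class WiredTopic implements Serializable {
        private static final long serialVersionUID = 1L;
        final String topic;
        final int partition;
        final byte[] messageSet;

        WiredTopic(String topic, int partition, byte[] messageSet) {
            this.topic = topic;
            this.partition = partition;
            this.messageSet = messageSet;
        }
    }

    // Serializes the whole list with Java object serialization and wraps the bytes.
    static ByteBuffer toBuffer(List<WiredTopic> topics) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            // Copy into an ArrayList so the written list is known to be Serializable.
            out.writeObject(new ArrayList<>(topics));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ByteBuffer.wrap(bytes.toByteArray());
    }
}
```

The upside is that it is only a few lines on the JVM side; the downside is exactly the
portability concern above.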

Another option is making this new format Thrift based (or Protocol Buffers, etc.).  I appreciate
and understand that this adds yet another dependency, but it allows for language independence
for what is now a complex data structure (not the simple 3 fields that exist now).  The entire
format could be a Thrift object, and the serialized bytes become a single put into the byte
buffer.

The last option I have considered is to basically loop through the data structures, putting each
element into the ByteBuffer and advancing its position as we go.  After the ack timeout we could
store some hint in the byte buffer describing the [] of topic -> [partition, message set].  E.g.
we have 2 topics being published to, one of them having 2 partitions with a message set each and
the other having 3 partitions/message sets.  We could create a "hint" giving the number of topics
and the count of partitions/message sets for each topic (in this case 2,2,3) and then "put" topic1,
partitionA, messageA, partitionB, messageB, topic2, partitionC, messageC, partitionD, messageD,
partitionE, messageE.  The drawback here is some nuanced complexity (bordering on esoteric):
taking the object model, breaking it out, storing it, and then pulling the stored values back out
based on the hint, so we know which positions are topics and which are partitions.  The "hint"
could be a delimited string (if this is the approach that is adopted): the count of topics
followed by the count of partitions for each topic.  For 2,2,3 split on ",": [0] is the count of
topics, [1] is the count of partitions/message sets for topic1, and [2] is the count of
partitions/message sets for topic2.
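
A rough sketch of the "hint" layout described above, in Java.  The length prefixes on the strings
and message sets, the use of a plain byte[] for a message set, and all of the names
(HintFramingSketch, encode, decode) are assumptions made for illustration; none of this is an
agreed wire format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class HintFramingSketch {

    // Builds the hint: topic count, then each topic's partition count, e.g. "2,2,3".
    static String buildHint(Map<String, Map<Integer, byte[]>> data) {
        StringBuilder hint = new StringBuilder().append(data.size());
        for (Map<Integer, byte[]> partitions : data.values()) {
            hint.append(',').append(partitions.size());
        }
        return hint.toString();
    }

    // Assumed framing: 2-byte length followed by UTF-8 bytes.
    static void putString(ByteBuffer buf, String s) {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        buf.putShort((short) b.length);
        buf.put(b);
    }

    static String getString(ByteBuffer buf) {
        byte[] b = new byte[buf.getShort()];
        buf.get(b);
        return new String(b, StandardCharsets.UTF_8);
    }

    // Writes the hint, then topic, partition, message set, partition, message set, ...
    static void encode(Map<String, Map<Integer, byte[]>> data, ByteBuffer buf) {
        putString(buf, buildHint(data));
        for (Map.Entry<String, Map<Integer, byte[]>> topic : data.entrySet()) {
            putString(buf, topic.getKey());
            for (Map.Entry<Integer, byte[]> p : topic.getValue().entrySet()) {
                buf.putInt(p.getKey());          // partition id
                buf.putInt(p.getValue().length); // assumed 4-byte message set length
                buf.put(p.getValue());
            }
        }
    }

    // Reads the hint first, then uses its counts to know what each position holds.
    static Map<String, Map<Integer, byte[]>> decode(ByteBuffer buf) {
        String[] counts = getString(buf).split(",");
        int topicCount = Integer.parseInt(counts[0]);
        Map<String, Map<Integer, byte[]>> data = new LinkedHashMap<>();
        for (int t = 0; t < topicCount; t++) {
            String topic = getString(buf);
            int partitionCount = Integer.parseInt(counts[t + 1]);
            Map<Integer, byte[]> partitions = new LinkedHashMap<>();
            for (int p = 0; p < partitionCount; p++) {
                int partition = buf.getInt();
                byte[] messages = new byte[buf.getInt()];
                buf.get(messages);
                partitions.put(partition, messages);
            }
            data.put(topic, partitions);
        }
        return data;
    }
}
```

Every client would need to port the equivalent of encode/decode by hand, which is the
"driver" adoption cost of this option.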

There might be some other options here?  Am I missing something or overcomplicating this?  My
preference is the Thrift approach, but I appreciate the "hint" approach also and would be quite
all right with that too.  It works with no additional dependency; the drawback is that it makes
client code ("driver") adoption a tad harder.

thoughts?  
                
> implement new producer and consumer request format
> --------------------------------------------------
>
>                 Key: KAFKA-240
>                 URL: https://issues.apache.org/jira/browse/KAFKA-240
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: core
>            Reporter: Jun Rao
>             Fix For: 0.8
>
>
> We want to change the producer/consumer request/response format according to the discussion in the following wiki:
> https://cwiki.apache.org/confluence/display/KAFKA/New+Wire+Format+Proposal

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       
