kafka-dev mailing list archives

From Jay Kreps <jay.kr...@gmail.com>
Subject Re: Kafka/Hadoop consumers and producers
Date Wed, 03 Jul 2013 23:48:49 GMT
I guess I am more concerned about the long term than the short term. I
think if you guys want to have all the Hadoop+Kafka stuff then we should
move the producer there and it sounds like it would be possible to get
similar functionality from the existing consumer code. I am not in a rush; I
just want to figure out a plan.

The alternative is to find someone who is interested in maintaining this
stuff in Kafka. The current state, where it is poorly documented and
maintained, is not good.

-Jay


On Wed, Jul 3, 2013 at 1:51 PM, Ken Goodhope <kengoodhope@gmail.com> wrote:

> We can easily make a Camus configuration that would mimic the
> functionality of the hadoop consumer in contrib.  It may require the
> addition of a BinaryWritable decoder, and a couple minor code changes.  As
> for the producer, we don't have anything in Camus that does what it does.
> But maybe we should at some point.  In the meantime, Gaurav is going to
> take a look at what is in contrib and see if it is easily fixed.  I have a
> feeling it probably will take minimal effort, and allow us to kick the can
> down the road till we get more time to properly address this.
>
> @Jay, would this work for now?
>
> Ken
>
>
> On Wed, Jul 3, 2013 at 10:57 AM, Felix GV <felix@mate1inc.com> wrote:
>
>> IMHO, I think Camus should probably be decoupled from Avro before the
>> simpler contribs are deleted.
>>
>> We don't actually use the contribs, so I'm not saying this for our sake,
>> but it seems like the right thing to do to provide simple examples for this
>> type of stuff, no...?
>>
>> --
>> Felix
>>
>>
>> On Wed, Jul 3, 2013 at 4:56 AM, Cosmin Lehene <clehene@adobe.com> wrote:
>>
>>> If the Hadoop consumer/producer use case will remain relevant for Kafka
>>> (I assume it will), it would make sense to have the core components (Kafka
>>> input/output format at least) as part of Kafka so that they could be built,
>>> tested and versioned together to maintain compatibility.
>>> This would also make it easier to build custom MR jobs on top of Kafka,
>>> rather than having to decouple stuff from Camus.
>>> It would also be less confusing for users, at least when they start
>>> using Kafka.
>>>
>>> Camus could use those instead of providing its own.
>>>
>>> This being said we did some work on the consumer side (0.8 and the
>>> new(er) MR API).
>>> We could probably try to rewrite them to use Camus or fix Camus or
>>> whatever, but please consider this alternative as well.
>>>
>>> Thanks,
>>> Cosmin
>>>
>>>
>>>
>>> On 7/3/13 11:06 AM, "Sam Meder" <sam.meder@jivesoftware.com> wrote:
>>>
>>> >I think it makes sense to kill the hadoop consumer/producer code in
>>> >Kafka, given, as you said, Camus and the simplicity of the Hadoop
>>> >producer.
>>> >
>>> >/Sam
>>> >
>>> >On Jul 2, 2013, at 5:01 PM, Jay Kreps <jay.kreps@gmail.com> wrote:
>>> >
>>> >> We currently have a contrib package for consuming and producing messages
>>> >> from MapReduce (
>>> >> https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tree;f=contrib;h=e53e1fb34893e733b10ff27e79e6a1dcbb8d7ab0;hb=HEAD
>>> >> ).
>>> >>
>>> >> We keep running into problems (e.g. KAFKA-946) that are basically due to
>>> >> the fact that the Kafka committers mostly don't seem to be Hadoop
>>> >> developers and aren't doing a good job of maintaining this code (keeping it
>>> >> tested, improving it, documenting it, writing tutorials, getting it moved
>>> >> over to the more modern APIs, getting it working with newer Hadoop
>>> >> versions, etc).
>>> >>
>>> >> A couple of options:
>>> >> 1. We could try to get someone in the Kafka community (either a current
>>> >> committer or not) who would adopt this as their baby (it's not much code).
>>> >> 2. We could just let Camus take over this functionality. They already have
>>> >> a more sophisticated consumer and the producer is pretty minimal.
>>> >>
>>> >> So are there any people who would like to adopt the current Hadoop contrib
>>> >> code?
>>> >>
>>> >> Conversely would it be possible to provide the same or similar
>>> >> functionality in Camus and just delete these?
>>> >>
>>> >> -Jay
>>> >
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Camus - Kafka ETL for Hadoop" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to camus_etl+unsubscribe@googlegroups.com.
>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>
>>>
>>>
>
