kafka-dev mailing list archives

From Ken Goodhope <kengoodh...@gmail.com>
Subject Re: Kafka/Hadoop consumers and producers
Date Wed, 03 Jul 2013 20:51:36 GMT
We can easily make a Camus configuration that would mimic the functionality
of the hadoop consumer in contrib.  It may require the addition of a
BinaryWritable decoder and a couple of minor code changes.  As for the
producer, we don't have anything in Camus that does the same job, but
maybe we should at some point.  In the meantime, Gaurav is going to take a
look at what is in contrib and see if it is easily fixed.  I have a feeling
it will take minimal effort and allow us to kick the can down the
road until we get more time to address this properly.

@Jay, would this work for now?

Ken


On Wed, Jul 3, 2013 at 10:57 AM, Felix GV <felix@mate1inc.com> wrote:

> IMHO, I think Camus should probably be decoupled from Avro before the
> simpler contribs are deleted.
>
> We don't actually use the contribs, so I'm not saying this for our sake,
> but it seems like the right thing to do to provide simple examples for this
> type of stuff, no...?
>
> --
> Felix
>
>
> On Wed, Jul 3, 2013 at 4:56 AM, Cosmin Lehene <clehene@adobe.com> wrote:
>
>> If the Hadoop consumer/producer use case will remain relevant for Kafka
>> (I assume it will), it would make sense to have the core components (Kafka
>> input/output format at least) as part of Kafka so that they could be built,
>> tested and versioned together to maintain compatibility.
>> This would also make it easier to build custom MR jobs on top of Kafka,
>> rather than having to decouple stuff from Camus.
>> It would also be less confusing for users, at least when they start
>> using Kafka.
>>
>> Camus could use those instead of providing its own.
>>
>> This being said we did some work on the consumer side (0.8 and the new(er)
>> MR API).
>> We could probably try to rewrite them to use Camus or fix Camus or
>> whatever, but please consider this alternative as well.
>>
>> Thanks,
>> Cosmin
>>
>>
>>
>> On 7/3/13 11:06 AM, "Sam Meder" <sam.meder@jivesoftware.com> wrote:
>>
>> >I think it makes sense to kill the hadoop consumer/producer code in
>> >Kafka, given, as you said, Camus and the simplicity of the Hadoop
>> >producer.
>> >
>> >/Sam
>> >
>> >On Jul 2, 2013, at 5:01 PM, Jay Kreps <jay.kreps@gmail.com> wrote:
>> >
>> >> We currently have a contrib package for consuming and producing messages
>> >> from mapreduce (
>> >> https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tree;f=contrib;h=e53e1fb34893e733b10ff27e79e6a1dcbb8d7ab0;hb=HEAD
>> >> ).
>> >>
>> >> We keep running into problems (e.g. KAFKA-946) that are basically due to
>> >> the fact that the Kafka committers mostly don't seem to be Hadoop
>> >> developers and aren't doing a good job of maintaining this code (keeping it
>> >> tested, improving it, documenting it, writing tutorials, getting it moved
>> >> over to the more modern APIs, getting it working with newer Hadoop
>> >> versions, etc).
>> >>
>> >> A couple of options:
>> >> 1. We could try to get someone in the Kafka community (either a current
>> >> committer or not) who would adopt this as their baby (it's not much code).
>> >> 2. We could just let Camus take over this functionality. They already have
>> >> a more sophisticated consumer and the producer is pretty minimal.
>> >>
>> >> So are there any people who would like to adopt the current Hadoop contrib
>> >> code?
>> >>
>> >> Conversely would it be possible to provide the same or similar
>> >> functionality in Camus and just delete these?
>> >>
>> >> -Jay
>> >
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Camus - Kafka ETL for Hadoop" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to camus_etl+unsubscribe@googlegroups.com.
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>>
>>
