spark-dev mailing list archives

From Cody Koeninger <c...@koeninger.org>
Subject Re: using Spark Streaming with Kafka 0.9/0.10
Date Wed, 16 Nov 2016 03:47:59 GMT
Generating / defining an RDD is not the same thing as running the
compute() method of an RDD.  The direct stream definitely runs Kafka
consumers on the executors.
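
For example, with the 0.10 integration (a minimal sketch; the broker
address, group id, and topic are placeholders, and ssc is an existing
StreamingContext):

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010._

// This runs on the driver and only *defines* the stream; the driver
// tracks offsets, while the KafkaConsumer instances that actually poll
// the brokers are created per partition on the executors.
val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "broker1:9092",        // placeholder
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "example-group",                // placeholder
  "auto.offset.reset" -> "latest",
  "enable.auto.commit" -> (false: java.lang.Boolean)
)

val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  LocationStrategies.PreferConsistent,
  ConsumerStrategies.Subscribe[String, String](Seq("example-topic"), kafkaParams)
)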

If you want more info, the blog post and video linked from
https://github.com/koeninger/kafka-exactly-once refer to the 0.8
implementation, but the general design is similar for the 0.10
version.
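
For instance, the offset handling pattern from that post looks roughly
like this with the 0.10 API, using the stream from the snippet above
(the atomic-store step is elided):

import org.apache.spark.TaskContext
import org.apache.spark.streaming.kafka010.HasOffsetRanges

stream.foreachRDD { rdd =>
  // Driver side: the offset ranges were fixed when the RDD was defined.
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges

  rdd.foreachPartition { iter =>
    // Executor side: this closure runs on the workers, where the
    // consumer for this partition does the actual fetching.
    val range = offsetRanges(TaskContext.get.partitionId)
    // process iter and store results together with range.untilOffset ...
  }
}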

I think the likelihood of an official release supporting 0.9 is fairly
slim at this point; 0.9 is a year out of date, and supporting it would
not be a drop-in dependency change.
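
(For reference, the two integrations are separate artifacts with
different APIs, so moving between them is a code change rather than a
version bump; the sbt coordinates below are illustrative, for Scala 2.11:)

libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka-0-8" % "2.0.2"
// vs.
libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.0.2"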


On Tue, Nov 15, 2016 at 5:50 PM, aakash aakash <email2aakash@gmail.com> wrote:
>
>
>> You can use the 0.8 artifact to consume from a 0.9 broker
>
> We are currently using Camus in production, and one of the main goals of
> moving to Spark is to use the new Kafka consumer API introduced in Kafka
> 0.9. In our case we need the security provisions available in 0.9; that
> is why we cannot use the 0.8 client.
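>
> The kind of new-consumer security settings we mean looks roughly like
> the sketch below (all hosts, paths, and passwords are placeholders;
> these options have no equivalent in the 0.8 consumer API):
>
> val kafkaParams = Map[String, Object](
>   "bootstrap.servers" -> "broker1:9093",
>   "security.protocol" -> "SASL_SSL",
>   "ssl.truststore.location" -> "/path/to/truststore.jks",
>   "ssl.truststore.password" -> "changeit",
>   "sasl.kerberos.service.name" -> "kafka"
> )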
>
>> Where are you reading documentation indicating that the direct stream
> only runs on the driver?
>
> I might be wrong here, but I see that the new Kafka + Spark streaming
> code extends InputDStream, and its documentation says: "Input streams
> that can generate RDDs from new data by running a service/thread only on
> the driver node (that is, without running a receiver on worker nodes)."
>
> Thanks and regards,
> Aakash Pradeep
>
>
> On Tue, Nov 15, 2016 at 2:55 PM, Cody Koeninger <cody@koeninger.org> wrote:
>>
>> It'd probably be worth no longer marking the 0.8 interface as
>> experimental.  I don't think it's likely to be subject to active
>> development at this point.
>>
>> You can use the 0.8 artifact to consume from a 0.9 broker
>>
>> Where are you reading documentation indicating that the direct stream
>> only runs on the driver?  It runs consumers on the worker nodes.
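>>
>> For example, with the 0.8 integration (a minimal sketch; the broker
>> address and topic are placeholders, and ssc is an existing
>> StreamingContext):
>>
>> import kafka.serializer.StringDecoder
>> import org.apache.spark.streaming.kafka.KafkaUtils
>>
>> // Defined on the driver, but the actual fetching happens inside each
>> // partition's task on the worker nodes.
>> val stream = KafkaUtils.createDirectStream[
>>   String, String, StringDecoder, StringDecoder](
>>   ssc,
>>   Map("metadata.broker.list" -> "broker1:9092"),
>>   Set("example-topic"))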
>>
>>
>> On Tue, Nov 15, 2016 at 10:58 AM, aakash aakash <email2aakash@gmail.com>
>> wrote:
>> > Re-posting it at dev group.
>> >
>> > Thanks and Regards,
>> > Aakash
>> >
>> >
>> > ---------- Forwarded message ----------
>> > From: aakash aakash <email2aakash@gmail.com>
>> > Date: Mon, Nov 14, 2016 at 4:10 PM
>> > Subject: using Spark Streaming with Kafka 0.9/0.10
>> > To: user-subscribe@spark.apache.org
>> >
>> >
>> > Hi,
>> >
>> > I am planning to use Spark Streaming to consume messages from Kafka
>> > 0.9. I have a couple of questions regarding this:
>> >
>> > 1. I see the APIs are annotated with @Experimental. Can you please
>> > tell me when they are planned to be production ready?
>> >
>> > 2. Currently I see we are using Kafka 0.10, so I am curious to know
>> > why we did not start with the 0.9 Kafka client instead of 0.10. As I
>> > understand it, the 0.10 Kafka client would not be compatible with the
>> > 0.9 client, since there are some changes in the consumer API
>> > arguments.
>> >
>> > 3. The current API extends InputDStream, and per the documentation
>> > that means RDDs will be generated by running a service/thread only on
>> > the driver node instead of the worker nodes. Can you please explain
>> > why this is done and what is required to make sure it runs on the
>> > worker nodes?
>> >
>> >
>> > Thanks in advance !
>> >
>> > Regards,
>> > Aakash
>> >
>
>


