spark-user mailing list archives

From Cody Koeninger <c...@koeninger.org>
Subject Re: Spark Streaming - Kafka Direct Approach: re-compute from specific time
Date Wed, 25 May 2016 16:11:40 GMT
KafkaUtils.createDirectStream has an overload that takes a map from
TopicAndPartition to offset, which sets the starting point of the stream.
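
A minimal sketch of passing starting offsets into a job, assuming the
Kafka 0.8 integration from the Spark 1.x line (pyspark.streaming.kafka),
whose createDirectStream accepts a fromOffsets dict keyed by
TopicAndPartition. The "topic:partition:offset" argument format, the
helper names, and the broker string are hypothetical, chosen only for
illustration.

```python
def parse_start_offsets(spec):
    """Parse 'topic:partition:offset,...' into {(topic, partition): offset}."""
    offsets = {}
    for item in spec.split(","):
        # rsplit keeps topic names that contain ':' intact.
        topic, partition, offset = item.strip().rsplit(":", 2)
        offsets[(topic, int(partition))] = int(offset)
    return offsets


def create_stream_from(ssc, brokers, start_offsets):
    """Start a direct stream at the given offsets (assumes the Kafka 0.8 API)."""
    # Imported here so the parsing helper above works without Spark installed.
    from pyspark.streaming.kafka import KafkaUtils, TopicAndPartition

    from_offsets = {
        TopicAndPartition(topic, partition): offset
        for (topic, partition), offset in start_offsets.items()
    }
    # The topic list is derived from the offsets so the two stay consistent.
    topics = sorted({topic for (topic, _) in start_offsets})
    return KafkaUtils.createDirectStream(
        ssc, topics, {"metadata.broker.list": brokers}, fromOffsets=from_offsets
    )
```

The job could then take the offset string as a command-line argument,
for example create_stream_from(ssc, "broker1:9092",
parse_start_offsets(sys.argv[1])). Every partition the job should read
needs an entry in the map.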

On Wed, May 25, 2016 at 9:59 AM, trung kien <kientt86@gmail.com> wrote:
> Thanks, Cody.
>
> I can build the mapping from time -> offset. However, how can I pass
> that offset to the Spark Streaming job (using the Direct Approach)?
>
> On May 25, 2016 9:42 AM, "Cody Koeninger" <cody@koeninger.org> wrote:
>>
>> Kafka does not yet have meaningful time indexing. There's a Kafka
>> improvement proposal for it, but it has been pushed back to at least
>> 0.10.1.
>>
>> If you want to do this kind of thing, you will need to maintain your
>> own index from time to offset.
>>
>> On Wed, May 25, 2016 at 8:15 AM, trung kien <kientt86@gmail.com> wrote:
>> > Hi all,
>> >
>> > Is there any way to re-compute from a specific time using the
>> > Spark Streaming Kafka Direct Approach?
>> >
>> > In some cases, I want to re-compute from a specific time (e.g. the
>> > beginning of the day). Is that possible?
>> >
>> >
>> >
>> > --
>> > Thanks
>> > Kien
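
The self-maintained index from time to offset suggested in the thread
could look roughly like the sketch below. It is an in-memory
illustration only: the class name, the checkpoint scheme (periodically
recording the next offset to be read per partition), and the use of
millisecond timestamps are all assumptions, and a real job would persist
the checkpoints somewhere durable.

```python
import bisect


class TimeOffsetIndex:
    """Per-partition checkpoints mapping a timestamp to a Kafka offset."""

    def __init__(self):
        self._times = {}    # (topic, partition) -> sorted timestamps
        self._offsets = {}  # (topic, partition) -> offsets, parallel to _times

    def record(self, topic, partition, timestamp_ms, offset):
        """Append a checkpoint; timestamps must be non-decreasing per partition."""
        key = (topic, partition)
        self._times.setdefault(key, []).append(timestamp_ms)
        self._offsets.setdefault(key, []).append(offset)

    def offsets_at(self, timestamp_ms):
        """Return {(topic, partition): offset} to restart from for a given time."""
        result = {}
        for key, times in self._times.items():
            # Use the last checkpoint at or before the requested time, so that
            # no message after that time is skipped. If the time precedes all
            # checkpoints, fall back to the earliest one recorded.
            i = bisect.bisect_right(times, timestamp_ms) - 1
            result[key] = self._offsets[key][max(i, 0)]
        return result
```

Picking the checkpoint at or before the requested time means some
earlier messages may be reprocessed, which is usually the safer side to
err on when re-computing.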
