spark-dev mailing list archives

From Julio Antonio Soto de Vicente
Subject Re: Maintaining overall cumulative data in Spark Streaming
Date Thu, 29 Oct 2015 22:20:07 GMT
-dev +user

Hi Sandeep,

Perhaps (flat)mapping values and using an accumulator?
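
A rough sketch of that idea, in plain Python rather than Spark so it runs anywhere. The names `flat_map_words` and `RunningWordCount` are made up for illustration and are not Spark APIs. The first stands in for a `flatMap` over the incoming lines, the second for whatever accumulator-like state holds the totals across batches. It assumes words are separated by whitespace and that something outside the sketch triggers the reset at 0:00.

```python
from collections import Counter
from typing import Iterable, List


def flat_map_words(lines: Iterable[str]) -> List[str]:
    """Split every line into words, the way a flatMap over lines would."""
    return [word for line in lines for word in line.split()]


class RunningWordCount:
    """Hypothetical stand-in for the accumulator: totals that outlive a batch."""

    def __init__(self) -> None:
        self.totals: Counter = Counter()

    def add_batch(self, lines: Iterable[str]) -> Counter:
        # Merge this batch's counts into the running totals
        # instead of replacing them, so nothing is reset per batch.
        self.totals.update(flat_map_words(lines))
        return self.totals

    def reset(self) -> None:
        # Assumed to be called once a day at 0:00 to start over.
        self.totals.clear()


if __name__ == "__main__":
    running = RunningWordCount()
    running.add_batch(["to be or not", "to be"])  # first micro-batch
    running.add_batch(["be quick"])               # second micro-batch
    print(dict(running.totals))
```

In a real job the per-batch merge would have to happen inside Spark. Spark Streaming also has `updateStateByKey` for keeping per-key state across batches (it needs checkpointing enabled), which is an alternative not mentioned in this thread.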

> On 29/10/2015, at 23:08, Sandeep Giri wrote:
> Dear All,
> If a continuous stream of text is coming in and you have to keep publishing the overall word count so far since 0:00 today, what would you do?
> Publishing the results for a window is easy, but if we have to keep aggregating the results, how do we go about it?
> I have tried to keep a StreamRDD with the aggregated count and keep doing a fullOuterJoin, but it didn't work. It seems like the StreamRDD gets reset.
> Kindly help.
> Regards,
> Sandeep Giri
