spark-dev mailing list archives

From Julio Antonio Soto de Vicente <ju...@esbet.es>
Subject Re: Maintaining overall cumulative data in Spark Streaming
Date Thu, 29 Oct 2015 22:20:07 GMT
-dev +user

Hi Sandeep,

Perhaps (flat)mapping values and using an accumulator?
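Another common approach in Spark 1.x is `updateStateByKey`, which keeps a running total per key across batches. The per-key update logic can be sketched in plain Python without a cluster; the names `update_fn` and `run_batches` below are illustrative, not Spark API, and the batch loop merely simulates what the DStream would do for you:

```python
# Sketch of the per-key state update behind updateStateByKey:
# merge each batch's counts for a key into that key's running total.

def update_fn(new_values, running_count):
    # new_values: list of counts for this key in the current batch
    # running_count: total so far, or None if the key is unseen
    return sum(new_values) + (running_count or 0)

def run_batches(batches):
    # Simulate a stream: each batch is a list of words.
    state = {}
    for batch in batches:
        counts = {}
        for word in batch:
            counts[word] = counts.get(word, 0) + 1
        for word, c in counts.items():
            state[word] = update_fn([c], state.get(word))
    return state

# Two micro-batches; totals accumulate across them.
totals = run_batches([["a", "b", "a"], ["b", "c"]])
print(totals)  # {'a': 2, 'b': 2, 'c': 1}
```

In real Spark Streaming you would pass `update_fn` to `updateStateByKey` on a paired DStream and enable checkpointing, since the state must survive across batches.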


> On 29/10/2015, at 23:08, Sandeep Giri <sandeep@knowbigdata.com> wrote:
> 
> Dear All,
> 
> If a continuous stream of text is coming in and you have to keep publishing the overall
> word count so far since 0:00 today, what would you do?
> 
> Publishing the results for a window is easy, but if we have to keep aggregating the results,
> how do we go about it?
> 
> I have tried keeping a StreamRDD with the aggregated count and repeatedly doing a fullOuterJoin,
> but it didn't work. It seems like the StreamRDD gets reset.
> 
> Kindly help.
> 
> Regards,
> Sandeep Giri
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org

