spark-user mailing list archives

From huylv <huy.le...@insight-centre.org>
Subject Where to save intermediate results?
Date Thu, 28 Aug 2014 20:30:07 GMT
Hi,

I'm building a system for near-real-time data analytics. My plan is to have
an ETL batch job that computes aggregations, running periodically. User
queries are then parsed and answered with on-demand calculations, also in
Spark. Where should the pre-calculated results be saved? After the
aggregations finish, the ETL job terminates, so its cached RDDs are dropped
from memory. How can I reuse those results to answer on-demand queries? More
generally, what is a good way to organize the data flow and the jobs to
achieve this?
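Concretely, the flow I have in mind looks something like this (the HDFS
paths and the word-count aggregation are just placeholders for my real
logic):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object EtlJob {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("etl-aggregations"))

    // Compute the periodic aggregations (word counts as a stand-in).
    val aggregates = sc.textFile("hdfs:///events/2014-08-28")
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1L))
      .reduceByKey(_ + _)

    // Persist to durable storage so a later job can reload the results
    // after this application has exited and its caches are gone.
    aggregates.saveAsObjectFile("hdfs:///aggregates/2014-08-28")

    sc.stop()
  }
}

// A separate query application would then reload them with:
//   val aggregates =
//     sc.objectFile[(String, Long)]("hdfs:///aggregates/2014-08-28")
```

Is writing to HDFS like this the right approach, or is there a better place
to hand results from one job to the next?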

I'm new to Spark, so sorry if this sounds like a dumb question.

Thank you.
Huy



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Where-to-save-intermediate-results-tp13062.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org

