spark-user mailing list archives

From Joanne Contact <joannenetw...@gmail.com>
Subject Is spark streaming +MlLib for online learning?
Date Tue, 25 Nov 2014 00:40:46 GMT
Hi Gurus,

Sorry for my naive question. I am new.

I seem to remember reading somewhere that Spark itself still does batch
learning, but that Spark Streaming could allow online learning.

I cannot find this on the website now:

http://spark.apache.org/docs/latest/streaming-programming-guide.html

I know MLlib uses incremental or iterative algorithms. I wonder if this is
also true between batches of Spark Streaming.

So the question is: say I call MLlib linear regression on a stream. Does
each training call use one batch of data as its training data? If yes, is
the model update between batches already taken care of? That is, will the
model eventually be trained on all the data that has arrived from the
beginning up to the time of scoring, or will it only use the data from a
limited number of recent batches?
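
To make the two alternatives concrete, here is a minimal sketch in plain
Python. It is a toy, not Spark or MLlib code: the names sgd_step, batches,
w_carry and w_reset are made up for illustration, and the data is a
hypothetical stream drawn from y = 2x. I am not assuming MLlib behaves
either way; this only shows what I mean by "carried over" versus "one
batch only".

```python
# Toy illustration only: plain Python, not the Spark or MLlib API.

def sgd_step(w, batch, lr=0.1):
    """One gradient step for 1-D least squares (y ~ w * x) on one mini-batch."""
    grad = sum((w * x - y) * x for x, y in batch) / len(batch)
    return w - lr * grad

# Hypothetical stream: three mini-batches of (x, y) pairs with y = 2x.
batches = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(1.0, 2.0), (4.0, 8.0)],
]

# (a) Online learning: the weight learned on one batch is the starting
#     point for the next, so every batch seen so far influences the model.
w_carry = 0.0
for batch in batches:
    w_carry = sgd_step(w_carry, batch)

# (b) One batch only: the model is reset for every batch, so only the
#     most recent batch influences the model.
w_reset = 0.0
for batch in batches:
    w_reset = sgd_step(0.0, batch)

print("carried over:", w_carry)
print("reset per batch:", w_reset)
```

With the same number of gradient steps per batch, (a) ends up closer to
the true weight of 2 than (b), because it accumulates what it learned from
the earlier batches.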


Many thanks!

J
