predictionio-user mailing list archives

From Pat Ferrel <>
Subject Re: delay of engines
Date Mon, 26 Sep 2016 23:11:34 GMT
If you need the model updated in real time you are talking about a Kappa architecture, and PredictionIO
does not support that. It does Lambda only.

The MLlib-based recommenders use live contexts to serve from in-memory copies of the ALS models
but the models themselves were calculated in the background. There are several scaling issues
with doing this but it can be done.

On Sep 25, 2016, at 10:23 AM, Georg Heiler <> wrote:

Wow thanks. This is a great explanation. 

So when I think about writing a Spark template for fraud detection (a combination of Spark
SQL and XGBoost) that requires <1 second latency, how should I store the model?

As far as I know, startup of YARN jobs, e.g. a Spark job, is too slow for that.
So it would be great if the model could be evaluated without using the cluster, or at least
with a hot Spark context similar to Spark Jobserver. Or is this possible for <>?

Pat Ferrel < <>> wrote on Sun, Sep 25, 2016 at 18:19:
Gustavo is correct. To put it another way, both Oryx and PredictionIO are based on what is called
a Lambda Architecture. Loosely speaking, this means a potentially slow background task computes
the predictive “model” but this does not interfere with serving queries. Then when the
model is ready (stored in HDFS or Elasticsearch depending on the template) it is deployed
and the switch happens in microseconds.
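
A minimal sketch of the pattern described above, in plain Python rather than PredictionIO code. The names `ModelServer`, `deploy`, `query` and `train` are hypothetical and only illustrate the idea: queries keep being answered by the old model while a new one is computed in the background, and deployment is just a reference swap.

```python
import threading


class ModelServer:
    """Serves queries from whichever model is currently deployed."""

    def __init__(self, model):
        self._model = model
        self._lock = threading.Lock()

    def deploy(self, new_model):
        # The switch is a single reference assignment; no training happens here.
        with self._lock:
            self._model = new_model

    def query(self, x):
        # Grab the current model, then score outside the lock.
        with self._lock:
            model = self._model
        return model(x)


def train(threshold):
    # Stand-in for a slow batch job (hypothetical); returns a scoring function.
    return lambda amount: amount > threshold


server = ModelServer(train(threshold=100))
before = server.query(150)

# Train in the background while the old model keeps answering queries.
result = {}
worker = threading.Thread(target=lambda: result.update(model=train(threshold=200)))
worker.start()
during = server.query(150)  # still served by the old model
worker.join()

server.deploy(result["model"])
after = server.query(150)
print(before, during, after)
```

The query path never waits for training; it only ever sees a fully built model, either the old one or the new one.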

In the case of the Universal Recommender the model is stored in Elasticsearch. During `pio
train` the new model is inserted into Elasticsearch and indexed. Once the indexing is done
the index alias used to serve queries is switched to the new index in one atomic action so
there is no downtime and any slow operation happens in the background without impeding queries.
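
As an illustration of what an atomic alias switch looks like, the sketch below builds the request body that Elasticsearch's `_aliases` endpoint accepts, where a `remove` and an `add` action in one request are applied together. This is not taken from the Universal Recommender source; the alias and index names are made up, and the helper `alias_swap_body` is hypothetical.

```python
import json


def alias_swap_body(alias, old_index, new_index):
    """Build an _aliases request body that moves `alias` from old_index to new_index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names, for illustration only.
body = alias_swap_body("ur_serving", "ur_model_20160925", "ur_model_20160926")
print(json.dumps(body, indent=2))
```

The body would be sent as a POST to the cluster's `_aliases` endpoint. Queries that address the alias never need to know which concrete index is behind it.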

The answer will vary somewhat with the template. Templates that use HDFS for storage may need
to be re-deployed but still the switch from using one to having the new one running is microseconds.

PMML is not relevant to the above discussion and is in any case useless for many model types, including
recommenders. If you look carefully at how it is implemented in Oryx you will see that
the PMML “models” for recommenders are not actually stored as PMML; only a minimal description
of where the real data is stored is in PMML. Remember that it has all the problems of XML,
including no good way to read in parallel.

On Sep 25, 2016, at 7:47 AM, Gustavo Frederico < <>> wrote:

I understand that the querying for PredictionIO is very fast, as if it
were an Elasticsearch query. Also recall that the training moment is a
different moment that often takes a long time in most learning
systems, but as long as it's not ridiculously long, it doesn't matter
that much.


On Sun, Sep 25, 2016 at 2:30 AM, Georg Heiler < <>> wrote:
> Hi predictionIO users,
> I wonder what is the delay of an engine evaluating a model in <>.
> Are the models cached?
> Another project <> is generating PMML which can be evaluated
> quickly from a production application.
> I believe that very often the latency until the prediction happens is
> overlooked. How does PredictionIO handle this topic?
> Best regards,
> Georg
