predictionio-user mailing list archives

From Georg Heiler <>
Subject Re: delay of engines
Date Tue, 27 Sep 2016 07:04:42 GMT
For me, the latency of model evaluation is more important than training
latency. This holds true for retraining / model updates as well. I would
say that the "evaluation / prediction" latency is the most critical one.

Your point regarding 3) is very interesting for me. I have 2 types of data:

   - low volume information about a customer
   - high volume usage data

The high volume data will require aggregation (e.g. Spark SQL) before the
model can be evaluated. Here, a higher latency would be OK.
Regarding the low volume data: some features will require some sort of SQL
for extraction.
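The split Georg describes (heavy aggregation of usage data, cheap lookup of customer data) can be sketched as a plain pre-aggregation step that runs before model evaluation. A minimal illustration in pure Python; in the thread this step would be Spark SQL, and all names here are hypothetical:

```python
# Hypothetical sketch: collapse high-volume usage events into per-customer
# features ahead of time, so model evaluation itself stays cheap.
# In practice this aggregation would be a Spark SQL job, not plain Python.
from collections import defaultdict

def aggregate_usage(events):
    """Collapse raw usage events into per-customer feature dicts."""
    feats = defaultdict(lambda: {"event_count": 0, "total_bytes": 0})
    for e in events:
        f = feats[e["customer_id"]]
        f["event_count"] += 1
        f["total_bytes"] += e["bytes"]
    return dict(feats)

events = [
    {"customer_id": "c1", "bytes": 512},
    {"customer_id": "c1", "bytes": 1024},
    {"customer_id": "c2", "bytes": 256},
]
features = aggregate_usage(events)
```

Because this runs offline, its latency does not affect prediction latency; only the lookup of the precomputed features sits on the query path.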

Kenneth Chan <> wrote on Tue., 27 Sep 2016 at 07:43:

> re: Kappa vs Lambda.
> As far as I understand, at a high level, Kappa is more like a subset of
> Lambda (i.e. only keep the real-time part).
> Georg, could you be more specific about what you mean by "latency
> requirement"?
> 1. latency of training a model with new data?
> 2. latency of deploying a new model? or
> 3. latency of getting a predicted result from the previously trained model
> given a query?
> If you are talking about 3, it depends on how your model calculates the
> prediction. It doesn't need Spark if the model can fit into memory.
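Kenneth's point 3 can be made concrete with a toy in-memory model: once training has produced the parameters (offline, e.g. on Spark), serving a query is plain in-process arithmetic with no cluster involved. `LinearModel` and its weights below are hypothetical stand-ins, not PredictionIO API:

```python
# Sketch of point 3: if the trained model fits in memory, prediction needs
# no Spark job at all -- just local arithmetic on the loaded parameters.
class LinearModel:
    def __init__(self, weights, bias):
        self.weights = weights  # produced by offline training
        self.bias = bias

    def predict(self, features):
        # Pure in-process computation: latency is microseconds.
        return self.bias + sum(w * x for w, x in zip(self.weights, features))

model = LinearModel(weights=[0.5, -0.25], bias=1.0)
score = model.predict([2.0, 4.0])  # 1.0 + (0.5*2.0) + (-0.25*4.0) = 1.0
```

The same idea extends to tree ensembles: traversing a serialized random forest in memory is fast; only retraining it needs the cluster.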
> On Mon, Sep 26, 2016 at 9:41 PM, Georg Heiler <>
> wrote:
>> Hi Donald
>> For me it is more about stacking and meta learning. The selection of
>> models could be performed offline.
>> But
>> 1. I am concerned about keeping the model up to date (retraining).
>> 2. Having some sort of reinforcement learning to reward / punish based on
>> the correctness of new ground truth arriving once per month.
>> 3. Having very quick responses, especially something like evaluating a
>> random forest / GBT / neural net without starting a YARN job.
>> Thank you all for the feedback so far
>> Best regards,
>> Georg
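Georg's point 2 (reward / punish based on monthly ground truth) can be sketched as a crude weight update for an ensemble member: boost its vote when it was right, shrink it when it was wrong. This is a hypothetical illustration of the idea, not anything PredictionIO provides; names and the learning rate are made up:

```python
# Hypothetical sketch of point 2: a reinforcement-style update applied when
# a new batch of ground truth arrives (here, once per "month").
def update_weight(weight, predicted, actual, lr=0.1):
    """Boost the weight if the model was right, shrink it otherwise."""
    return weight * (1 + lr) if predicted == actual else weight * (1 - lr)

w = 1.0
w = update_weight(w, predicted=1, actual=1)  # correct -> reward: w = 1.1
w = update_weight(w, predicted=0, actual=1)  # wrong -> punish: w = 0.99
```

Since the ground truth arrives only monthly, this update can run as a cheap offline job; it never touches the query path.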
>> Donald Szeto <> wrote on Tue., 27 Sep 2016 at 06:34:
>>> Sorry for side-tracking. I think Kappa architecture is a promising
>>> paradigm, but including batch processing from the canonical store to the
>>> serving layer store should still be necessary. I believe this somewhat
>>> hybrid Kappa-Lambda architecture would be generic enough to handle many use
>>> cases. If this is something that sounds good to everyone, we should drive
>>> PredictionIO to that direction.
>>> Georg, are you talking about updating an existing model in different
>>> ways, evaluating them, and selecting one within a time constraint, say
>>> every 1 second?
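Donald's question about selecting among model variants within a time constraint can be sketched as evaluation under a deadline: score candidate variants until the budget runs out, then serve the best one found so far. Everything here is a hypothetical illustration of the pattern, not PredictionIO behavior:

```python
import time

# Sketch: update an existing model in several ways, evaluate each variant,
# and pick the best one that fits inside a time budget (e.g. 1 second).
def select_best(candidates, evaluate, budget_s=1.0):
    deadline = time.monotonic() + budget_s
    best, best_score = None, float("-inf")
    for cand in candidates:
        if time.monotonic() >= deadline:
            break  # out of time: serve the best variant found so far
        score = evaluate(cand)
        if score > best_score:
            best, best_score = cand, score
    return best

scores = {"model_a": 0.8, "model_b": 0.9}  # stand-in evaluation results
best = select_best(["model_a", "model_b"], evaluate=scores.get)
```

The anytime structure is the point: even when the budget expires mid-search, the serving layer always has a usable answer.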
>>> On Mon, Sep 26, 2016 at 4:11 PM, Pat Ferrel <>
>>> wrote:
>>>> If you need the model updated in real time you are talking about a Kappa
>>>> architecture, and PredictionIO does not support that. It does Lambda only.
>>>> The MLlib-based recommenders use live contexts to serve from in-memory
>>>> copies of the ALS models but the models themselves were calculated in the
>>>> background. There are several scaling issues with doing this but it can be
>>>> done.
>>>> On Sep 25, 2016, at 10:23 AM, Georg Heiler <>
>>>> wrote:
>>>> Wow, thanks. This is a great explanation.
>>>> So when I think about writing a Spark template for fraud detection (a
>>>> combination of Spark SQL and XGBoost) that would require <1 second
>>>> latency, how should I store the model?
>>>> As far as I know, the startup of YARN jobs, e.g. a Spark job, is too
>>>> slow for that.
>>>> So it would be great if the model could be evaluated without using the
>>>> cluster, or at least with a hot Spark context similar to spark-jobserver
>>>> or <>. Is this possible?
>>>> Regards,
>>>> Georg
>>>> Pat Ferrel <> wrote on Sun., 25 Sep 2016 at 18:19:
>>>>> Gustavo is correct. To put it another way, both Oryx and PredictionIO
>>>>> are based on what is called a Lambda Architecture. Loosely speaking,
>>>>> this means a potentially slow background task computes the predictive
>>>>> “model” so that it does not interfere with serving queries. Then, when
>>>>> the model is ready (stored in HDFS or Elasticsearch depending on the
>>>>> template), it is deployed and the switch happens in microseconds.
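The serving side of the Lambda pattern Pat describes can be sketched as an atomic hot swap of an in-memory model reference: training finishes in the background, then the serving layer repoints to the new model in one step. `ModelServer` below is a hypothetical stand-in for illustration only:

```python
import threading

# Sketch of the Lambda serving pattern: background training produces a new
# model, and the serving layer swaps it in atomically, so queries never block
# on training and the switch itself takes microseconds.
class ModelServer:
    def __init__(self, model):
        self._model = model
        self._lock = threading.Lock()

    def swap(self, new_model):
        # The switch is just a reference assignment under a short lock.
        with self._lock:
            self._model = new_model

    def query(self, x):
        with self._lock:
            model = self._model  # grab the current model reference
        return model(x)          # evaluate outside the lock

server = ModelServer(lambda x: x * 2)  # currently deployed model
server.swap(lambda x: x * 3)           # background-trained replacement
result = server.query(10)
```

Queries issued during a swap see either the old model or the new one, never a half-updated state.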
>>>>> In the case of the Universal Recommender the model is stored in
>>>>> Elasticsearch. During `pio train` the new model is inserted into
>>>>> Elasticsearch and indexed. Once the indexing is done, the index alias
>>>>> used to serve queries is switched to the new index in one atomic action,
>>>>> so there is no downtime and any slow operation happens in the background
>>>>> without impeding queries.
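The alias trick can be illustrated with a toy alias table: the remove-old / add-new pair is applied as a single step, so readers of the alias never see a gap. In Elasticsearch the real equivalent is one request to the `_aliases` endpoint with both actions; the index names below are made up:

```python
# Hypothetical sketch of the atomic alias swap: a new index is built in the
# background, then the serving alias is repointed in one step (in
# Elasticsearch, a single _aliases request carrying remove + add actions).
def swap_alias(aliases, alias, old_index, new_index):
    """Repoint `alias` from old_index to new_index as one atomic action."""
    updated = dict(aliases)
    assert updated[alias] == old_index  # sanity check before swapping
    updated[alias] = new_index          # remove + add in a single step
    return updated

aliases = {"ur_serving": "ur_model_v1"}            # queries hit the alias
aliases = swap_alias(aliases, "ur_serving", "ur_model_v1", "ur_model_v2")
```

Because queries always go through the alias rather than a concrete index name, the slow work (indexing the new model) never blocks serving.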
>>>>> The answer will vary somewhat with the template. Templates that use
>>>>> HDFS for storage may need to be re-deployed, but still the switch from
>>>>> serving one model to having the new one running takes microseconds.
>>>>> PMML is not relevant to the discussion above and is in any case useless
>>>>> for many model types, including recommenders. If you look carefully at
>>>>> what is implemented in Oryx you will see that the PMML “models” for
>>>>> recommenders are not actually stored as PMML; only a minimal description
>>>>> of where the real data is stored is in PMML. Remember that PMML has all
>>>>> the problems of XML, including no good way to read it in parallel.
>>>>> On Sep 25, 2016, at 7:47 AM, Gustavo Frederico <
>>>>>> wrote:
>>>>> I understand that querying in PredictionIO is very fast, as if it were
>>>>> an Elasticsearch query. Also recall that training is a separate moment
>>>>> that often takes a long time in most learning systems, but as long as
>>>>> it's not ridiculously long, it doesn't matter that much.
>>>>> Gustavo
>>>>> On Sun, Sep 25, 2016 at 2:30 AM, Georg Heiler <
>>>>>> wrote:
>>>>> > Hi PredictionIO users,
>>>>> > I wonder what the delay of an engine evaluating a model is.
>>>>> > Are the models cached?
>>>>> >
>>>>> > Another project is generating PMML, which can be evaluated quickly
>>>>> > from a production application.
>>>>> >
>>>>> > I believe that very often the latency until the prediction happens is
>>>>> > overlooked. How does PredictionIO handle this topic?
>>>>> >
>>>>> > Best regards,
>>>>> > Georg
