predictionio-user mailing list archives

From Georg Heiler <georg.kf.hei...@gmail.com>
Subject Re: delay of engines
Date Tue, 27 Sep 2016 07:04:42 GMT
For me, the latency of model evaluation is more important than training
latency. This holds true for retraining / model updates as well. I would
say that the "evaluation / prediction" latency is the most critical one.

Your point regarding 3) is very interesting for me. I have 2 types of data:

   - low volume information about a customer
   - high volume usage data

The high volume data will require aggregation (e.g. Spark SQL) before the
model can be evaluated. Here, a higher latency would be OK.
Regarding the low volume data: some features will require some sort of SQL
for extraction.
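
A minimal sketch of this split, assuming the usage aggregates are
precomputed by a slow batch job and only looked up at prediction time.
All names here (UsageAggregate, InMemoryScorer, toy_model) are hypothetical
and are not part of PredictionIO or Spark; the toy model only stands in for
a trained one.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class UsageAggregate:
    """Per-customer summary produced by the slow batch job (hypothetical fields)."""
    event_count: int
    total_amount: float


class InMemoryScorer:
    """Serves predictions from memory; no cluster job is started per query."""

    def __init__(self, model: Callable[[List[float]], float]) -> None:
        self._model = model
        self._aggregates: Dict[str, UsageAggregate] = {}

    def load_aggregates(self, aggregates: Dict[str, UsageAggregate]) -> None:
        # Replace the whole lookup table with a copy of the latest batch output.
        self._aggregates = dict(aggregates)

    def predict(self, customer_id: str, customer_features: List[float]) -> float:
        # Unknown customers fall back to an all-zero aggregate.
        agg = self._aggregates.get(customer_id, UsageAggregate(0, 0.0))
        features = customer_features + [float(agg.event_count), agg.total_amount]
        return self._model(features)


def toy_model(features: List[float]) -> float:
    """Stand-in for a trained model: flags large usage totals (illustrative only)."""
    return 1.0 if features[-1] > 1000.0 else 0.0


scorer = InMemoryScorer(toy_model)
scorer.load_aggregates({"c1": UsageAggregate(event_count=12, total_amount=2500.0)})
print(scorer.predict("c1", [35.0]))
print(scorer.predict("unknown", [35.0]))
```

The batch side can take as long as it needs; the query path is a dictionary
lookup plus one in-memory model call.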



Kenneth Chan <kenneth@apache.org> wrote on Tue, 27 Sep 2016 at 07:43:

> re: kappa vs lambda.
> as far as i understand, at high-level, kappa is more like a subset of
> lambda (ie. only keep the real-time part)
>
> https://www.ericsson.com/research-blog/data-knowledge/data-processing-architectures-lambda-and-kappa/
>
> Georg, could you be more specific when you talk about "latency requirement"?
>
> 1. latency of training a model with new data?
> 2. latency of deploying a new model? or
> 3. latency of getting predicted result using the previously trained model
> given a query?
>
> if you are talking about 3, it depends on how your model calculates the
> prediction. It doesn't need Spark if the model fits into memory.
>
>
>
>
> On Mon, Sep 26, 2016 at 9:41 PM, Georg Heiler <georg.kf.heiler@gmail.com>
> wrote:
>
>> Hi Donald
>> For me it is more about stacking and meta learning. The selection of
>> models could be performed offline.
>>
>> But I am concerned about:
>> 1. keeping the model up to date (retraining)
>> 2. having some sort of reinforcement learning to improve / punish based on
>> the correctness of new ground truth (1/month)
>> 3. having very quick responses, i.e. something like evaluating a random
>> forest / GBT / neural net without starting a YARN job.
>>
>> Thank you all for the feedback so far
>> Best regards,
>> Georg
>> Donald Szeto <donald@apache.org> wrote on Tue, 27 Sep 2016 at 06:34:
>>
>>> Sorry for side-tracking. I think Kappa architecture is a promising
>>> paradigm, but including batch processing from the canonical store to the
>>> serving layer store should still be necessary. I believe this somewhat
>>> hybrid Kappa-Lambda architecture would be generic enough to handle many use
>>> cases. If this is something that sounds good to everyone, we should drive
>>> PredictionIO in that direction.
>>>
>>> Georg, are you talking about updating an existing model in different
>>> ways, evaluating the variants, and selecting one within a time constraint,
>>> say every 1 second?
>>>
>>> On Mon, Sep 26, 2016 at 4:11 PM, Pat Ferrel <pat@occamsmachete.com>
>>> wrote:
>>>
>>>> If you need the model updated in realtime you are talking about a kappa
>>>> architecture and PredictionIO does not support that. It does Lambda only.
>>>>
>>>> The MLlib-based recommenders use live contexts to serve from in-memory
>>>> copies of the ALS models but the models themselves were calculated in the
>>>> background. There are several scaling issues with doing this but it can be
>>>> done.
>>>>
>>>> On Sep 25, 2016, at 10:23 AM, Georg Heiler <georg.kf.heiler@gmail.com>
>>>> wrote:
>>>>
>>>> Wow thanks. This is a great explanation.
>>>>
>>>> So when I think about writing a Spark template for fraud detection (a
>>>> combination of Spark SQL and XGBoost) that requires <1 second latency,
>>>> how should I store the model?
>>>>
>>>> As far as I know, the startup of YARN jobs (e.g. a Spark job) is too slow
>>>> for that.
>>>> So it would be great if the model could be evaluated without using the
>>>> cluster, or at least with a hot Spark context similar to Spark Jobserver
>>>> or SnappyData.io <http://snappydata.io>. Is this possible for
>>>> prediction.io?
>>>>
>>>> Regards,
>>>> Georg
>>>> Pat Ferrel <pat@occamsmachete.com> wrote on Sun, 25 Sep 2016 at 18:19:
>>>>
>>>>> Gustavo is correct. To put it another way, both Oryx and PredictionIO
>>>>> are based on what is called a Lambda Architecture. Loosely speaking, this
>>>>> means a potentially slow background task computes the predictive “model”,
>>>>> but this does not interfere with serving queries. Then, when the model is
>>>>> ready (stored in HDFS or Elasticsearch depending on the template), it is
>>>>> deployed and the switch happens in microseconds.
>>>>>
>>>>> In the case of the Universal Recommender the model is stored in
>>>>> Elasticsearch. During `pio train` the new model is inserted into
>>>>> Elasticsearch and indexed. Once the indexing is done, the index alias
>>>>> used to serve queries is switched to the new index in one atomic action,
>>>>> so there is no downtime and any slow operation happens in the background
>>>>> without impeding queries.
>>>>>
>>>>> The answer will vary somewhat with the template. Templates that use
>>>>> HDFS for storage may need to be re-deployed, but the switch from using
>>>>> the old model to having the new one running still takes microseconds.
>>>>>
>>>>> PMML is not relevant to the above discussion and is anyway useless
>>>>> for many model types, including recommenders. If you look carefully at
>>>>> how that is implemented in Oryx, you will see that the PMML “models” for
>>>>> recommenders are not actually stored as PMML; only a minimal description
>>>>> of where the real data is stored is in PMML. Remember that PMML has all
>>>>> the problems of XML, including no good way to read it in parallel.
>>>>>
>>>>>
>>>>> On Sep 25, 2016, at 7:47 AM, Gustavo Frederico <
>>>>> gustavo.frederico@thinkwrap.com> wrote:
>>>>>
>>>>> I understand that querying PredictionIO is very fast, as if it
>>>>> were an Elasticsearch query. Also recall that training is a separate
>>>>> step that often takes a long time in most learning systems, but as
>>>>> long as it's not ridiculously long, it doesn't matter that much.
>>>>>
>>>>> Gustavo
>>>>>
>>>>> On Sun, Sep 25, 2016 at 2:30 AM, Georg Heiler <
>>>>> georg.kf.heiler@gmail.com> wrote:
>>>>> > Hi PredictionIO users,
>>>>> > I wonder what the delay is when an engine evaluates a model in
>>>>> > prediction.io. Are the models cached?
>>>>> >
>>>>> > Another project, http://oryx.io/, generates PMML, which can be
>>>>> > evaluated quickly from a production application.
>>>>> >
>>>>> > I believe that the latency until the prediction happens is very often
>>>>> > overlooked. How does PredictionIO handle this topic?
>>>>> >
>>>>> > Best regards,
>>>>> > Georg
>>>>>
>>>>>
>>>>
>>>
>
