predictionio-user mailing list archives

From Kenneth Chan <kenn...@apache.org>
Subject Re: delay of engines
Date Tue, 27 Sep 2016 07:22:49 GMT
Just want to clarify, since you mix "evaluation / prediction":
1. "evaluation" means evaluating the performance of the model
2. "prediction" means calculating the prediction (fraud or not, in your case)
I understand that evaluation requires generating predictions in order to
evaluate the accuracy of the model.
But you are referring to low latency for 2, right?

"Regarding the low volume data: some features will require some sort of SQL
for extraction"
So if that extraction can be fast, and if using the model to calculate the
likelihood of fraud can be done in memory (without an RDD), then the latency
should be low.
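
For example, if the fraud model were an xgboost model, scoring a single query
entirely in memory (no Spark context, no RDD) could look roughly like the
sketch below. This is only an illustration; the model file name and the
feature values are made up.

    import numpy as np
    import xgboost as xgb

    # Load the previously trained model once, at deploy time.
    booster = xgb.Booster()
    booster.load_model("fraud_model.bin")

    # Score one incoming query; the features would come from your SQL extraction.
    query_features = np.array([[0.3, 12.0, 1.0, 250.75]])
    score = booster.predict(xgb.DMatrix(query_features))
    print("fraud likelihood:", float(score[0]))

Since this path is just a library call on an in-memory model, the per-query
latency is typically in the millisecond range.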



On Tue, Sep 27, 2016 at 12:04 AM, Georg Heiler <georg.kf.heiler@gmail.com>
wrote:

> For me, the latency of model evaluation is more important than training
> latency. This holds true for retraining / model updates as well. I would
> say that the "evaluation / prediction" latency is the most critical one.
>
> Your point regarding 3) is very interesting for me. I have 2 types of data:
>
>    - low volume information about a customer
>    - high volume usage data
>
> The high-volume data will require aggregation (e.g. Spark SQL) before the
> model can be evaluated. Here, a higher latency would be OK.
> Regarding the low volume data: some features will require some sort of SQL
> for extraction.
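>
> For the high-volume part, I mean roughly this kind of aggregation (just a
> sketch; the table and column names below are made up):
>
>     from pyspark.sql import SparkSession, functions as F
>
>     spark = SparkSession.builder.appName("feature-aggregation").getOrCreate()
>
>     # Hypothetical high-volume usage table.
>     usage = spark.table("usage_events")
>
>     # Per-customer aggregates that would feed the fraud model.
>     features = (usage
>         .groupBy("customer_id")
>         .agg(F.count("*").alias("n_events"),
>              F.sum("amount").alias("total_amount"),
>              F.avg("amount").alias("avg_amount")))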
>
>
>
> Kenneth Chan <kenneth@apache.org> wrote on Tue., 27 Sep. 2016 at 07:43:
>
>> Re: kappa vs lambda.
>> As far as I understand, at a high level, kappa is more like a subset of
>> lambda (i.e. only keep the real-time part):
>> https://www.ericsson.com/research-blog/data-knowledge/data-processing-architectures-lambda-and-kappa/
>>
>> Georg, would you be more specific about what you mean by "latency
>> requirement"?
>>
>> 1. latency of training a model with new data?
>> 2. latency of deploying a new model? or
>> 3. latency of getting a predicted result from the previously trained model
>> given a query?
>>
>> If you are talking about 3, it depends on how your model calculates the
>> prediction. Spark is not needed if the model can fit into memory.
>>
>>
>>
>>
>> On Mon, Sep 26, 2016 at 9:41 PM, Georg Heiler <georg.kf.heiler@gmail.com>
>> wrote:
>>
>>> Hi Donald
>>> For me it is more about stacking and meta learning. The selection of
>>> models could be performed offline.
>>>
>>> But:
>>> 1. I am concerned about keeping the model up to date (retraining).
>>> 2. Having some sort of reinforcement learning to improve / punish based
>>> on the correctness of new ground truth (1/month).
>>> 3. Having very quick responses, especially something like evaluating a
>>> random forest / GBT / neural net without starting a YARN job.
>>>
>>> Thank you all for the feedback so far.
>>> Best regards,
>>> Georg
>>> Donald Szeto <donald@apache.org> wrote on Tue., 27 Sep. 2016 at 06:34:
>>>
>>>> Sorry for side-tracking. I think the Kappa architecture is a promising
>>>> paradigm, but batch processing from the canonical store to the
>>>> serving-layer store will still be necessary. I believe this somewhat
>>>> hybrid Kappa-Lambda architecture would be generic enough to handle many use
>>>> cases. If this sounds good to everyone, we should drive PredictionIO in
>>>> that direction.
>>>>
>>>> Georg, are you talking about updating an existing model in different
>>>> ways, evaluating them, and selecting one within a time constraint, say
>>>> every 1 second?
>>>>
>>>> On Mon, Sep 26, 2016 at 4:11 PM, Pat Ferrel <pat@occamsmachete.com>
>>>> wrote:
>>>>
>>>>> If you need the model updated in real time, you are talking about a
>>>>> Kappa architecture, and PredictionIO does not support that. It does Lambda
>>>>> only.
>>>>>
>>>>> The MLlib-based recommenders use live contexts to serve from in-memory
>>>>> copies of the ALS models, but the models themselves were calculated in the
>>>>> background. There are several scaling issues with doing this, but it can
>>>>> be done.
>>>>>
>>>>> On Sep 25, 2016, at 10:23 AM, Georg Heiler <georg.kf.heiler@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Wow thanks. This is a great explanation.
>>>>>
>>>>> So when I think about writing a Spark template for fraud detection (a
>>>>> combination of Spark SQL and xgboost) that would require <1 second
>>>>> latency, how should I store the model?
>>>>>
>>>>> As far as I know, the startup of YARN jobs, e.g. a Spark job, is too
>>>>> slow for that.
>>>>> So it would be great if the model could be evaluated without using the
>>>>> cluster, or at least with a hot Spark context similar to spark-jobserver
>>>>> or SnappyData.io <http://snappydata.io>. Is this possible for
>>>>> prediction.io?
>>>>>
>>>>> Regards,
>>>>> Georg
>>>>> Pat Ferrel <pat@occamsmachete.com> wrote on Sun., 25 Sep. 2016 at
>>>>> 18:19:
>>>>>
>>>>>> Gustavo is correct. To put it another way, both Oryx and PredictionIO are
>>>>>> based on what is called a Lambda architecture. Loosely speaking, this means
>>>>>> a potentially slow background task computes the predictive “model” but
>>>>>> this does not interfere with serving queries. Then, when the model is ready
>>>>>> (stored in HDFS or Elasticsearch depending on the template), it is deployed
>>>>>> and the switch happens in microseconds.
>>>>>>
>>>>>> In the case of the Universal Recommender, the model is stored in
>>>>>> Elasticsearch. During `pio train` the new model is inserted into
>>>>>> Elasticsearch and indexed. Once the indexing is done, the index alias used
>>>>>> to serve queries is switched to the new index in one atomic action, so there
>>>>>> is no downtime and any slow operation happens in the background without
>>>>>> impeding queries.
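>>>>>>
>>>>>> Concretely, that atomic switch is just an Elasticsearch _aliases call,
>>>>>> roughly like the sketch below (the index and alias names are made up):
>>>>>>
>>>>>>     import requests
>>>>>>
>>>>>>     # Re-point the serving alias from the old index to the newly built
>>>>>>     # index; Elasticsearch applies both actions atomically.
>>>>>>     requests.post("http://localhost:9200/_aliases", json={
>>>>>>         "actions": [
>>>>>>             {"remove": {"index": "ur_model_v1", "alias": "ur_model"}},
>>>>>>             {"add": {"index": "ur_model_v2", "alias": "ur_model"}},
>>>>>>         ]
>>>>>>     })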
>>>>>>
>>>>>> The answer will vary somewhat with the template. Templates that use
>>>>>> HDFS for storage may need to be re-deployed, but even then the switch from
>>>>>> the old model to having the new one running takes microseconds.
>>>>>>
>>>>>> PMML is not relevant to the discussion above and is anyway useless
>>>>>> for many model types, including recommenders. If you look carefully at how
>>>>>> it is implemented in Oryx you will see that the PMML “models” for
>>>>>> recommenders are not actually stored as PMML; only a minimal description of
>>>>>> where the real data is stored is in PMML. Remember that PMML has all the
>>>>>> problems of XML, including no good way to read it in parallel.
>>>>>>
>>>>>>
>>>>>> On Sep 25, 2016, at 7:47 AM, Gustavo Frederico <
>>>>>> gustavo.frederico@thinkwrap.com> wrote:
>>>>>>
>>>>>> I understand that querying in PredictionIO is very fast, as if it
>>>>>> were an Elasticsearch query. Also recall that training is a
>>>>>> separate step that often takes a long time in most learning
>>>>>> systems, but as long as it's not ridiculously long, it doesn't matter
>>>>>> that much.
>>>>>>
>>>>>> Gustavo
>>>>>>
>>>>>> On Sun, Sep 25, 2016 at 2:30 AM, Georg Heiler <
>>>>>> georg.kf.heiler@gmail.com> wrote:
>>>>>> > Hi predictionIO users,
>>>>>> > I wonder what the delay is of an engine evaluating a model in
>>>>>> > prediction.io. Are the models cached?
>>>>>> >
>>>>>> > Another project, http://oryx.io/, generates PMML, which can be
>>>>>> > evaluated quickly from a production application.
>>>>>> >
>>>>>> > I believe that the latency until the prediction happens is very
>>>>>> > often overlooked. How does predictionIO handle this topic?
>>>>>> >
>>>>>> > Best regards,
>>>>>> > Georg
>>>>>>
>>>>>>
>>>>>
>>>>
>>
