predictionio-user mailing list archives

From Kenneth Chan <kenn...@apache.org>
Subject Re: Saving predictions on training data with unsupervised learning
Date Sun, 05 Mar 2017 04:10:40 GMT
I guess your use case is not real-time classification of unseen data?

Batch prediction is basically the same as batch evaluation.
See if this example helps:

http://predictionio.incubator.apache.org/templates/recommendation/batch-evaluator/
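
To make the batch idea concrete, here is a minimal, framework-free Python sketch. It is not PredictionIO's API and not the linked template's code. It trains a toy K-Means model, then labels every training row in one pass while the data is still in memory, instead of sending one query per row. The function names, the `user` entity type, the `cluster` property, and the `$set`-style record layout are all assumptions chosen for illustration.

```python
from math import dist


def nearest(point, centroids):
    """Return the index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def train_kmeans(points, k, iterations=10):
    """Plain Lloyd's algorithm, seeded with the first k points (toy choice)."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its members;
        # keep the old centroid if a cluster ended up empty.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids


def label_training_data(rows, centroids):
    """Label all training rows in one pass, one record per row."""
    return [
        {
            "event": "$set",        # assumed record layout, for illustration
            "entityType": "user",   # hypothetical entity type
            "entityId": entity_id,
            "properties": {"cluster": nearest(features, centroids)},
        }
        for entity_id, features in rows
    ]


# Hypothetical training data: (entity id, feature vector)
rows = [
    ("u1", (1.0, 1.0)),
    ("u2", (1.2, 0.8)),
    ("u3", (9.0, 9.0)),
    ("u4", (8.8, 9.1)),
]
centroids = train_kmeans([features for _, features in rows], k=2)
labeled = label_training_data(rows, centroids)
```

The resulting `labeled` list could then be written back to the event store in bulk. How that write happens depends on the engine and is left out here.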




On Sat, Mar 4, 2017 at 11:56 AM Mars Hall <mars@heroku.com> wrote:

> Hi 🐸 folks,
>
> When using unsupervised learning algorithms (like K-Means) we need to save
> the predicted labels (cluster IDs) for the training data back into the
> datastore. Ideally, we want to automatically save bulk predictions for the
> training data after the model is created, when the RDD/DataFrame of all
> that data is already resident in Spark memory. It seems complex &
> inefficient to develop a whole separate process that (re)selects all that
> training data and then iteratively POSTs to `/queries.json` to get every
> prediction…
>
> Would adding a `bulk_save_predictions()` function to the persistent
> model's #save method be the right way to save predictions back into
> the eventdata store?
>
> How do you folks label the training data from an unsupervised algorithm?
>
> Any suggestions for making bulk predictions that mesh with PredictionIO's
> workflow?
>
> Mars Hall
> Customer Facing Architect
> Salesforce App Cloud / Heroku
> San Francisco, California
>
>
