spark-user mailing list archives

From "Ganelin, Ilya" <Ilya.Gane...@capitalone.com>
Subject Re: MLLib in Production
Date Wed, 10 Dec 2014 21:34:46 GMT
Hi all, I've been storing the model's userFeatures and productFeatures vectors (which are
generated internally) serialized on disk, and loading them in a separate job.
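
A minimal sketch of the serving side of that approach, in plain Python with no Spark dependency. It assumes the factor vectors were exported as JSON maps from id to a list of floats; that file format and the helper names (save_factors, load_factors, predict_rating) are hypothetical. The only model-specific fact used is that a matrix factorization model scores a (user, product) pair as the dot product of the two feature vectors.

```python
import json


def save_factors(path, factors):
    """Write {id: [floats]} to disk as JSON (hypothetical export format)."""
    with open(path, "w") as f:
        json.dump({str(k): v for k, v in factors.items()}, f)


def load_factors(path):
    """Read factors back, restoring integer ids."""
    with open(path) as f:
        return {int(k): v for k, v in json.load(f).items()}


def predict_rating(user_factors, product_factors, user, product):
    """Score a (user, product) pair as the dot product of their factor vectors."""
    u = user_factors[user]
    p = product_factors[product]
    if len(u) != len(p):
        raise ValueError("user and product vectors must have the same rank")
    return sum(a * b for a, b in zip(u, p))
```

The separate job then only needs load_factors on both files and a predict_rating call per request; a KeyError signals a user or product that was not in the training data.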

From: Sonal Goyal <sonalgoyal4@gmail.com>
Date: Wednesday, December 10, 2014 at 5:31 AM
To: Yanbo Liang <yanbohappy@gmail.com>
Cc: Simon Chan <simonchan@gmail.com>, Klausen Schaefersinho <klaus.schaefers@gmail.com>,
"user@spark.apache.org" <user@spark.apache.org>
Subject: Re: MLLib in Production

You can also serialize the model and use it in other places.

Best Regards,
Sonal
Founder, Nube Technologies <http://www.nubetech.co>

<http://in.linkedin.com/in/sonalgoyal>



On Wed, Dec 10, 2014 at 5:32 PM, Yanbo Liang <yanbohappy@gmail.com> wrote:
Hi Klaus,

There is no ideal method, but there are some workarounds.
Train the model on a Spark or YARN cluster, then use RDD.saveAsTextFile to store the model,
which includes the weights and intercept, in HDFS.
Then load the weights file and intercept file from HDFS, construct a GLM model, and call its
model.predict() method to get what you want.
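
The loading and prediction half of this workaround could look roughly like the sketch below. It is plain Python rather than MLlib, so it could run somewhere without a Spark context. It assumes the weights were written as one comma-separated line and the intercept as a single number; the actual text layout depends on how the RDD was built before saveAsTextFile. The class and function names are hypothetical, and logistic regression with a 0.5 threshold is used as one example of a GLM.

```python
import math


def parse_weights(text):
    """Parse a comma-separated line of weights (assumed file layout)."""
    return [float(x) for x in text.strip().split(",")]


def parse_intercept(text):
    """Parse a file holding a single number (assumed file layout)."""
    return float(text.strip())


class LogisticModel:
    """Hypothetical stand-in for a GLM rebuilt from saved weights and intercept."""

    def __init__(self, weights, intercept, threshold=0.5):
        self.weights = weights
        self.intercept = intercept
        self.threshold = threshold

    def predict_probability(self, features):
        if len(features) != len(self.weights):
            raise ValueError("feature vector length must match the weights")
        margin = sum(w * x for w, x in zip(self.weights, features)) + self.intercept
        # Evaluate the sigmoid in a form that avoids overflow for large |margin|.
        if margin >= 0:
            return 1.0 / (1.0 + math.exp(-margin))
        e = math.exp(margin)
        return e / (1.0 + e)

    def predict(self, features):
        """Return class 1 if the probability exceeds the threshold, else 0."""
        return 1 if self.predict_probability(features) > self.threshold else 0
```

In a servlet-style deployment the two files would be read once at startup, and predict would be called per request.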

The Spark community also has some ongoing work on exporting models with PMML.

2014-12-10 18:32 GMT+08:00 Simon Chan <simonchan@gmail.com>:
Hi Klaus,

PredictionIO is an open source product based on Spark MLlib for exactly this purpose.
This is the tutorial for classification in particular: http://docs.prediction.io/classification/quickstart/

You can add custom serving logic and retrieve prediction results from other places through
the REST API/SDKs.

Simon


On Wed, Dec 10, 2014 at 2:25 AM, Klausen Schaefersinho <klaus.schaefers@gmail.com> wrote:
Hi,


I would like to use Spark to train a model, but use the model in some other place, e.g. a
servlet to do some classification in real time.

What is the best way to do this? Can I just copy a model file or something and load it in
the servlet? Can anybody point me to a good tutorial?


Cheers,


Klaus



--
“Overfitting” is not about an excessive amount of physical exercise...



