predictionio-user mailing list archives

From Pat Ferrel <...@occamsmachete.com>
Subject Re: Dose v.011 support Spark ML, DataFrame and Pipeline
Date Thu, 20 Apr 2017 17:55:42 GMT
No, the API for access to data in the EventServer does not require DataFrames, and so does not
use them, but you can easily convert the data into one if you need it. As for Spark ML, use
whatever you need in your algorithm. There are no restrictions as long as you build PIO for
Spark 2 and include whatever libs you need in your Template’s build.sbt.
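
A minimal sketch of that conversion, not taken from any existing template: it assumes the data has already been read into an RDD of a case class, and the names `Rating`, `RddToDataFrame`, and `toDataFrame` are hypothetical. It uses only the standard Spark 2 `SparkSession` API.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical record type for illustration; a real template defines its own.
// It must be declared at top level so Spark can derive an encoder for it.
case class Rating(user: String, item: String, rating: Double)

object RddToDataFrame {
  // Converts an RDD of case-class records into a DataFrame whose columns
  // are named after the case-class fields (user, item, rating).
  def toDataFrame(ratings: RDD[Rating]): DataFrame = {
    // Reuse the SparkContext that already backs the RDD.
    val spark = SparkSession.builder()
      .config(ratings.sparkContext.getConf)
      .getOrCreate()
    import spark.implicits._ // brings .toDF() into scope for RDDs of case classes
    ratings.toDF()
  }
}
```

The resulting DataFrame can then be passed to Spark ML estimators or a Pipeline inside the algorithm, provided the matching Spark ML library is declared in the Template's build.sbt.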

I maintain The Universal Recommender, which uses Mahout on Spark, not MLlib. It also does
not use Spark for deployed query serving, which is typical of many Templates. So there is
room to use your own architecture as long as it fits the general patterns.

https://github.com/actionml/universal-recommender


On Apr 19, 2017, at 11:28 PM, Fangzhou Yang <fangzhou.yang@hotmail.com> wrote:

Thanks for the reply. 

As I understand it, the template algorithm uses the PAlgorithm interface from PIO, which uses
RDDs instead of DataFrames. Can I also implement a template algorithm with Spark ML and
DataFrames? Is there any guide online?

@Pat Ferrel Is the template that you maintain on GitHub? If so, could you provide the
link?

Many Thanks,
Fangzhou 
From: Pat Ferrel <pat@occamsmachete.com>
Sent: Wednesday, April 19, 2017 10:37:08 PM
To: user@predictionio.incubator.apache.org
Subject: Re: Dose v.011 support Spark ML, DataFrame and Pipeline
 
There is no restriction in templates on what they use from Spark. The ones you are looking
at simply don’t need those interfaces. If you need them and are writing templates, you can
use them. In fact, I maintain a template that does not use Spark for the Algorithm, only for
IO.

If you think some new API should be in the default PIO API, which would that be?


On Apr 19, 2017, at 12:06 AM, Fangzhou Yang <fangzhou.yang@hotmail.com> wrote:

Hi all,


I'm new to PredictionIO. I just noticed that v0.11 already supports Spark 2.x. Does it also
currently support Spark ML, DataFrame and Pipeline? It seems the Algorithm interfaces support
only Spark RDDs. If Spark ML is not supported for now, will it be on the roadmap? Is
anyone already working on it?

Many Thanks,
Fangzhou

