flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1966) Add support for predictive model markup language (PMML)
Date Thu, 08 Oct 2015 10:33:27 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14948453#comment-14948453 ]

ASF GitHub Bot commented on FLINK-1966:
---------------------------------------

Github user chiwanpark commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1186#discussion_r41498327
  
    --- Diff: flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/regression/MultipleLinearRegression.scala ---
    @@ -124,6 +121,52 @@ class MultipleLinearRegression extends Predictor[MultipleLinearRegression] {
         }
     
       }
    +
    +  override def toPMML(): PMML = {
    +    weightsOption match {
    +      case None => {
    +        throw new RuntimeException("The MultipleLinearRegression has not been fitted to the " +
    +          "data. This is necessary to learn the weight vector of the linear function.")
    +      }
    +      case Some(weights) => {
    +        val model = weights.collect().head
    +        val pmml = new PMML()
    +        pmml.setHeader(new Header().setDescription("Multiple Linear Regression"))
    +
    +        // define the fields
    +        val target = FieldName.create("prediction")
    +        val fields = scala.Array.ofDim[FieldName](model.weights.size)
    +        Range(0, model.weights.size).foreach(index =>
    +          fields(index) = FieldName.create("field_" + index)
    +        )
    +
    +        // define the data dictionary, mining schema and regression table
    +        val dictionary = new DataDictionary()
    +        val miningSchema = new MiningSchema()
    +        val regressionTable = new RegressionTable().setIntercept(model.intercept)
    +        Range(0, model.weights.size).foreach(index => {
    +          miningSchema.addMiningFields(
    +            new MiningField(fields(index)).setUsageType(FieldUsageType.ACTIVE)
    +          )
    +          regressionTable.addNumericPredictors(
    +            new NumericPredictor(fields(index), model.weights(index))
    +          )
    +          dictionary.addDataFields(
    +            new DataField(fields(index), OpType.CONTINUOUS, DataType.DOUBLE)
    +          )
    +        })
    +        dictionary.addDataFields(new DataField(target, OpType.CONTINUOUS, DataType.DOUBLE))
    +        miningSchema.addMiningFields(new MiningField(target).setUsageType(FieldUsageType.PREDICTED))
    +
    +        // define the model
    +        val pmmlModel = new RegressionModel()
    +          .setFunctionName(MiningFunctionType.REGRESSION)
    --- End diff --
    
    Maybe we should add `.setModelType(RegressionModel.ModelType.LINEAR_REGRESSION)` after
    this line, in anticipation of other regression models being added in the future.
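
To illustrate what the suggested model type would add to the exported document, here is a minimal sketch that renders a PMML RegressionModel fragment by hand. It does not use JPMML and is not the pull request's code: `LinearModel`, `RegressionPmmlSketch` and `regressionModelXml` are hypothetical names, and the fragment omits the MiningSchema and DataDictionary that a complete PMML document needs. The `modelType` attribute and its `linearRegression` value are assumed from the PMML schema for RegressionModel.

```scala
// Hypothetical stand-in for the fitted model in the diff: weights plus intercept.
final case class LinearModel(weights: Vector[Double], intercept: Double)

object RegressionPmmlSketch {

  // Renders a PMML RegressionModel element by hand (this is NOT JPMML output).
  // modelType is treated as optional, so it is passed as an Option.
  def regressionModelXml(model: LinearModel, modelType: Option[String]): String = {
    // Emit the modelType attribute only when a value is supplied.
    val modelTypeAttr = modelType.map(t => " modelType=\"" + t + "\"").getOrElse("")

    // One NumericPredictor per weight, named field_<index> as in the diff.
    val predictors = model.weights.zipWithIndex.map { case (w, i) =>
      "    <NumericPredictor name=\"field_" + i + "\" coefficient=\"" + w + "\"/>"
    }

    val lines =
      Vector(
        "<RegressionModel functionName=\"regression\"" + modelTypeAttr + ">",
        "  <RegressionTable intercept=\"" + model.intercept + "\">"
      ) ++ predictors ++ Vector("  </RegressionTable>", "</RegressionModel>")

    lines.mkString("\n")
  }

  def main(args: Array[String]): Unit = {
    val model = LinearModel(Vector(0.5, -1.25), intercept = 2.0)
    println(regressionModelXml(model, Some("linearRegression")))
  }
}
```

With the attribute present, a consumer can tell a linear regression export apart from other regression variants without inspecting the regression table itself.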


> Add support for predictive model markup language (PMML)
> -------------------------------------------------------
>
>                 Key: FLINK-1966
>                 URL: https://issues.apache.org/jira/browse/FLINK-1966
>             Project: Flink
>          Issue Type: Improvement
>          Components: Machine Learning Library
>            Reporter: Till Rohrmann
>            Assignee: Sachin Goel
>            Priority: Minor
>              Labels: ML
>
> The predictive model markup language (PMML) [1] is a widely used language to describe
> predictive and descriptive models as well as pre- and post-processing steps. It thereby
> provides an easy way to export models to and import models from other ML tools.
> Resources:
> [1] http://journal.r-project.org/archive/2009-1/RJournal_2009-1_Guazzelli+et+al.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
