flink-issues mailing list archives

From chobeat <...@git.apache.org>
Subject [GitHub] flink pull request: [FLINK-1966][ml]Add support for Predictive Mod...
Date Mon, 08 Feb 2016 21:37:58 GMT
Github user chobeat commented on the pull request:

    Hi @chiwanpark,
    > What is main purpose to support PMML? Is this feature for only model portability
in FlinkML?
    I've used PMML extensively in a previous project and have seen many use cases beyond
my own. PMML export is necessary for external portability: for example, you may need to train a
model in Flink and then apply it to local data with a data mining tool, or deploy it in a
production pipeline built on a completely different technology stack.
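
    To make the export case concrete, here is a minimal sketch of what writing a model out as
PMML could look like. It assumes a plain linear regression model and PMML 4.2; the function
`linear_model_to_pmml`, the feature names, and the target name `y` are hypothetical and are not
part of FlinkML or of this pull request.

```python
import xml.etree.ElementTree as ET

# Namespace of the PMML 4.2 schema (assumed target version for this sketch).
PMML_NS = "http://www.dmg.org/PMML-4_2"


def linear_model_to_pmml(feature_names, coefficients, intercept, target="y"):
    """Serialize a linear regression model as a PMML 4.2 RegressionModel string.

    Hypothetical helper for illustration only; not a FlinkML API.
    """
    if len(feature_names) != len(coefficients):
        raise ValueError("one coefficient is required per feature")

    root = ET.Element("PMML", {"version": "4.2", "xmlns": PMML_NS})
    ET.SubElement(root, "Header", {"description": "hypothetical linear model export"})

    # The data dictionary declares every field the model reads or predicts.
    fields = list(feature_names) + [target]
    dictionary = ET.SubElement(
        root, "DataDictionary", {"numberOfFields": str(len(fields))}
    )
    for name in fields:
        ET.SubElement(
            dictionary,
            "DataField",
            {"name": name, "optype": "continuous", "dataType": "double"},
        )

    model = ET.SubElement(root, "RegressionModel", {"functionName": "regression"})

    # The mining schema lists the inputs and marks the predicted field.
    schema = ET.SubElement(model, "MiningSchema")
    for name in feature_names:
        ET.SubElement(schema, "MiningField", {"name": name})
    ET.SubElement(schema, "MiningField", {"name": target, "usageType": "target"})

    # One NumericPredictor per feature carries the learned coefficient.
    table = ET.SubElement(
        model, "RegressionTable", {"intercept": repr(float(intercept))}
    )
    for name, coef in zip(feature_names, coefficients):
        ET.SubElement(
            table,
            "NumericPredictor",
            {"name": name, "coefficient": repr(float(coef))},
        )

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(linear_model_to_pmml(["x1", "x2"], [2.0, -0.5], 1.5))
```

    The resulting string is ordinary XML, so in principle any PMML-aware consumer outside Flink
can load it. A real exporter would also need to cover the other model types and validate its
output against the PMML schema.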

    PMML import is optional, though: you can use JPMML (the reference implementation of PMML)
to read a PMML file and evaluate the model locally on each node. Importing from PMML into the
native FlinkML implementation may be a plus in terms of usability, and probably performance,
but its absence is not really a blocking issue for a developer.
    > If not, we have to support other systems such as R or Spark MLlib.
    Support for R may be interesting in itself, but I don't understand what you mean. MLlib
does support PMML export (even if it is somewhat buggy for a few models, such as Naive Bayes),
so it is already possible to move models from MLlib to Flink.
    > What about FlinkML only format? I think that support for distributed system in PMML
is poor. XML-based format is hard to parallelize.
    This could be interesting, both to guarantee the consistency of the models and to tune the
format to our needs. PMML's complexity comes from its need for generality and consistency, but
it is often overkill for describing simple models. It also has only partial support for many
models that we may want to implement, e.g. any of the online learning algorithms implemented in
SAMOA or other online learning frameworks. I know we are still missing a few pieces before
reaching that point, but still...

