spark-issues mailing list archives

From "Nick Pentreath (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-21926) Some transformers in spark.ml.feature fail when trying to transform streaming dataframes
Date Wed, 06 Sep 2017 06:55:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-21926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16154901#comment-16154901 ]

Nick Pentreath commented on SPARK-21926:
----------------------------------------

For #2, (a) is definitely the correct solution.

> Some transformers in spark.ml.feature fail when trying to transform streaming dataframes
> ----------------------------------------------------------------------------------------
>
>                 Key: SPARK-21926
>                 URL: https://issues.apache.org/jira/browse/SPARK-21926
>             Project: Spark
>          Issue Type: Bug
>          Components: ML, Structured Streaming
>    Affects Versions: 2.2.0
>            Reporter: Bago Amirbekian
>
> We've run into a few cases where ML components don't play nice with streaming dataframes
> (for prediction). This ticket is meant to help aggregate these known cases in one place and
> provide a place to discuss possible fixes.
> Failing cases:
> 1) VectorAssembler where one of the inputs is a VectorUDT column with no metadata.
> Possible fixes:
> a) Redesign VectorUDT metadata to support missing metadata for some elements. (This
> might be a good thing to do anyway; see SPARK-19141.)
> b) Drop metadata in a streaming context.
> 2) OneHotEncoder where the input is a column with no metadata.
> Possible fixes:
> a) Make OneHotEncoder an estimator (SPARK-13030).
> b) Allow user to set the cardinality of OneHotEncoder.
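
To make the two proposed fixes for case 2 concrete, here is a minimal plain-Python sketch. It does not use the Spark API; `one_hot`, `encode_inferring_size`, and `FittedEncoder` are hypothetical names introduced only for illustration. It assumes the underlying problem is that, without metadata, the encoder has no declared number of categories, so any size derived from the data at hand can differ from one micro-batch to the next.

```python
# Plain-Python sketch (not the Spark API): why one-hot encoding needs a known
# cardinality when data arrives in micro-batches. All names are hypothetical.

def one_hot(index, size):
    """Return a dense one-hot list of length `size` with 1.0 at `index`."""
    if not 0 <= index < size:
        raise ValueError(f"index {index} out of range for size {size}")
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def encode_inferring_size(batch):
    """Stateless encoding: infer the vector size from the batch itself."""
    size = max(batch) + 1
    return [one_hot(i, size) for i in batch]


class FittedEncoder:
    """Estimator-style encoding: the size is fixed once and reused per batch."""

    def __init__(self, size):
        # Option (b): the user supplies the cardinality directly.
        self.size = size

    @classmethod
    def fit(cls, training_indices):
        # Option (a): the cardinality is learned from training data.
        return cls(max(training_indices) + 1)

    def transform(self, batch):
        return [one_hot(i, self.size) for i in batch]


batch_1 = [0, 1]
batch_2 = [0, 3]

# Inferring the size per batch yields vectors of different widths.
inferred_widths = [len(encode_inferring_size(b)[0]) for b in (batch_1, batch_2)]

# A fitted (or user-configured) encoder yields the same width for every batch.
model = FittedEncoder.fit([0, 1, 2, 3])
fixed = FittedEncoder(4)
fitted_widths = [len(model.transform(b)[0]) for b in (batch_1, batch_2)]
```

In this sketch both options give a stable output width; they differ only in where the cardinality comes from (learned during a fit step versus supplied as a parameter).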


