spark-issues mailing list archives

From "Nick Pentreath (JIRA)" <>
Subject [jira] [Commented] (SPARK-8418) Add single- and multi-value support to ML Transformers
Date Mon, 30 Oct 2017 07:43:00 GMT


Nick Pentreath commented on SPARK-8418:

Adding SPARK-13030, since the new version of {{OneHotEncoder}} will also support transforming
multiple columns.

> Add single- and multi-value support to ML Transformers
> ------------------------------------------------------
>                 Key: SPARK-8418
>                 URL:
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML
>            Reporter: Joseph K. Bradley
> It would be convenient if all feature transformers supported transforming columns of
> single values and multiple values, specifically:
> * one column with one value (e.g., type {{Double}})
> * one column with multiple values (e.g., {{Array[Double]}} or {{Vector}})
> We could go as far as supporting multiple columns, but that may not be necessary since
> VectorAssembler could be used to handle that.
> Estimators under {{ml.feature}} should also support this.
> This will likely require a short design doc to describe:
> * how input and output columns will be specified
> * schema validation
> * code sharing to reduce duplication
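
The single-value versus multi-value distinction in the quoted description can be illustrated without Spark. The following is a minimal plain-Python sketch, not Spark code and not the design proposed in the issue: `binarize`, `transform_column`, and the `threshold` parameter are hypothetical names chosen for illustration, and a Python list stands in for {{Array[Double]}} or {{Vector}}.

```python
from numbers import Real
from typing import Dict, List, Sequence, Union

# A cell holds either one number or a sequence of numbers.
Value = Union[float, Sequence[float]]


def binarize(value: Value, threshold: float = 0.0) -> Value:
    """Hypothetical transformer logic: 1.0 if a number exceeds threshold, else 0.0.

    Accepts a single number (one column with one value) or a sequence
    of numbers (one column with multiple values).
    """
    if isinstance(value, Real):
        return 1.0 if value > threshold else 0.0
    return [1.0 if v > threshold else 0.0 for v in value]


def transform_column(rows: List[Dict], column: str, threshold: float = 0.0) -> List[Dict]:
    """Apply binarize to one named column; rows are plain dicts, not a DataFrame."""
    return [{**row, column: binarize(row[column], threshold)} for row in rows]
```

Under these assumptions, one code path serves both column shapes, which is roughly the kind of code sharing the description asks the design doc to cover.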

