spark-issues mailing list archives

From "Tree Field (JIRA)" <>
Subject [jira] [Commented] (SPARK-6634) Allow replacing columns in Transformers
Date Fri, 10 Mar 2017 05:03:37 GMT


Tree Field commented on SPARK-6634:

I want this feature too, because I often override UnaryTransformer myself to enable it.

It seems replacement is only prevented in the transformSchema method.
Now, unlike before v1.4, the DataFrame withColumn method used in UnaryTransformer allows replacing the input column.
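
To make the point concrete, here is a minimal sketch in plain Python of the schema step being discussed. It is a toy model, not Spark code: the function name transform_schema, the allow_replace flag, and the list-of-pairs schema are all assumptions made for illustration. It only shows where a name clash between the input and output column would be rejected, and what a relaxed version might return instead.

```python
from typing import List, Tuple

# (column name, type name) pairs; a hypothetical stand-in for a real schema type
Schema = List[Tuple[str, str]]


def transform_schema(schema: Schema, input_col: str, output_col: str,
                     output_type: str, allow_replace: bool = False) -> Schema:
    """Toy model of a unary transformer's schema step (not Spark's implementation)."""
    names = [name for name, _ in schema]
    if input_col not in names:
        raise ValueError(f"Input column {input_col} does not exist.")
    if output_col in names:
        if not allow_replace:
            # The check described above: an existing output name is rejected.
            raise ValueError(f"Output column {output_col} already exists.")
        # Assumed relaxed behavior: overwrite the existing field, keeping its position.
        return [(n, output_type if n == output_col else t) for n, t in schema]
    # No clash: the output column is appended to the schema.
    return schema + [(output_col, output_type)]


schema = [("id", "long"), ("text", "string")]
print(transform_schema(schema, "text", "tokens", "array<string>"))
print(transform_schema(schema, "text", "text", "array<string>", allow_replace=True))
```

In this sketch the only thing that stops replacement is the explicit name check; once it is skipped, the output field simply takes the place of the input field.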

Are there any other reasons why this is not allowed in Transformers, especially in UnaryTransformer?

> Allow replacing columns in Transformers
> ---------------------------------------
>                 Key: SPARK-6634
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 1.3.0
>            Reporter: Joseph K. Bradley
>            Priority: Minor
> Currently, Transformers do not allow input and output columns to share the same name. (In fact, this is not allowed but also not even checked.)
> Short-term proposal: Disallow input and output columns with the same name, and add a check in transformSchema.
> Long-term proposal: Allow input & output columns with the same name, and where the behavior is that the output columns replace input columns with the same name.
