spark-issues mailing list archives

From "Akos Tomasits (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-22198) Java incompatibility when extending UnaryTransformer or Transformer
Date Fri, 13 Apr 2018 12:19:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-22198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akos Tomasits updated SPARK-22198:
----------------------------------
    Description: 
It is not possible to create a proper custom Transformer in Java by extending UnaryTransformer
or Transformer.

The built-in Params (e.g. those defined in the HasInputCol trait) cannot be used, and custom
Params cannot be added.

It seems that the method 'uid()' is called during object creation, before the provided 'uid'
constructor parameter can be stored in the object.
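
The ordering described above can be reproduced in plain Java without any Spark classes. The following is a minimal sketch, assuming the parent type calls uid() while it is being constructed; Base, Derived and uidSeenDuringConstruction are hypothetical names used only for illustration and are not part of Spark.

```java
// Plain-Java sketch of the initialization order; no Spark classes are used.
// Base and Derived are hypothetical stand-ins for the parent class and a custom Transformer.
final class InitOrderDemo {

    abstract static class Base {
        // Value of uid() as observed while the parent constructor is running.
        final String uidSeenDuringConstruction;

        Base() {
            // Virtual call: dispatches to the subclass override before the
            // subclass constructor body has assigned its fields.
            uidSeenDuringConstruction = uid();
        }

        abstract String uid();
    }

    static final class Derived extends Base {
        private final String sparkUid;

        Derived(String uid) {
            // The implicit super() call has already finished at this point.
            sparkUid = uid;
        }

        @Override
        String uid() {
            return sparkUid;
        }
    }

    public static void main(String[] args) {
        Derived d = new Derived("TextCleaner_1234");
        // prints: during construction: null
        System.out.println("during construction: " + d.uidSeenDuringConstruction);
        // prints: after construction:  TextCleaner_1234
        System.out.println("after construction:  " + d.uid());
    }
}
```

The field is still null when the parent constructor reads it, even though it is declared final, because Java runs the superclass constructor before the subclass constructor body.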

This leads to the following error:
{quote}java.lang.IllegalArgumentException: requirement failed: Param <prefix>_1563950936fa__inputCol
does not belong to <prefix>_d4105b75c4aa.
{quote}
If you extend UnaryTransformer and try to use it, e.g. through CrossValidator, you need
to explicitly include a constructor that takes a String parameter. Judging from the source
of the built-in transformers, this parameter is a 'uid' that should be stored in the object. However,
it cannot be stored in time, because the uid() method is invoked (and its result might
be used) before this constructor finishes.
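
To illustrate how an early uid() call could lead to a "does not belong to" failure like the one quoted above, here is a small simulation in plain Java. It is only a sketch under stated assumptions: FakeParam, FakeParams, FakeTransformer and checkOwnership are hypothetical stand-ins, and the assumption that a param records its owner's uid at creation and is later compared against uid() is inferred from the error message, not taken from the Spark source.

```java
import java.util.Objects;

// Simulation only: FakeParam, FakeParams and FakeTransformer are hypothetical
// stand-ins written for this example; they are not Spark classes.
final class OwnershipCheckDemo {

    /** Records the owner's uid at creation time (assumed behaviour). */
    static final class FakeParam {
        final String parentUid;
        final String name;

        FakeParam(String parentUid, String name) {
            this.parentUid = parentUid;
            this.name = name;
        }

        @Override
        public String toString() {
            return parentUid + "__" + name;
        }
    }

    abstract static class FakeParams {
        // Initialized while the parent part of the object is being constructed,
        // so uid() is evaluated before the subclass has stored its uid.
        final FakeParam inputCol = new FakeParam(uid(), "inputCol");

        abstract String uid();

        /** Throws when the param was bound to a uid other than the current one. */
        void checkOwnership(FakeParam param) {
            if (!Objects.equals(param.parentUid, uid())) {
                throw new IllegalArgumentException(
                    "requirement failed: Param " + param + " does not belong to " + uid() + ".");
            }
        }
    }

    static final class FakeTransformer extends FakeParams {
        private final String sparkUid;

        FakeTransformer(String uid) {
            sparkUid = uid; // runs only after inputCol has been created
        }

        @Override
        String uid() {
            return sparkUid;
        }
    }

    public static void main(String[] args) {
        FakeTransformer t = new FakeTransformer("TextCleaner_d4105b75c4aa");
        try {
            t.checkOwnership(t.inputCol);
        } catch (IllegalArgumentException e) {
            // prints: requirement failed: Param null__inputCol does not belong to TextCleaner_d4105b75c4aa.
            System.out.println(e.getMessage());
        }
    }
}
```

In this simulation the param is bound to null, whereas the reported message shows two different generated uids, so the sketch shows only the general mechanism and not the exact failure path.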

Sample class:
{quote}public class TextCleaner extends UnaryTransformer<String, String, TextCleaner>
        implements Serializable, DefaultParamsWritable, DefaultParamsReadable<TextCleaner> {

    private static final long serialVersionUID = 2658543236303100458L;

    private static final String sparkUidPrefix = "TextCleaner";

    private final String sparkUid;

    public TextCleaner() {
        sparkUid = org.apache.spark.ml.util.Identifiable$.MODULE$.randomUID(sparkUidPrefix);
    }

    public TextCleaner(String uid) {
        sparkUid = uid;
    }

    @Override
    public String uid() { // This method is called by the parent class before object creation finishes
        return sparkUid;
    }

    ...
{quote}

  was:
It is not possible to create a proper custom Transformer in Java by extending UnaryTransformer.

It seems that the method 'uid()' is called during object creation, before the provided 'uid'
constructor parameter can be stored in the object.

This leads to the following error:

{quote}
 java.lang.IllegalArgumentException: requirement failed: Param <prefix>_1563950936fa__inputCol
does not belong to <prefix>_d4105b75c4aa.
{quote}

If you extend UnaryTransformer and try to use it, e.g. through CrossValidator, you need
to explicitly include a constructor that takes a String parameter. Judging from the source
of the built-in transformers, this parameter is a 'uid' that should be stored in the object. However,
it cannot be stored in time, because the uid() method is invoked (and its result might
be used) before this constructor finishes.

Sample class:

{quote}
public class TextCleaner extends UnaryTransformer<String, String, TextCleaner>
        implements Serializable, DefaultParamsWritable, DefaultParamsReadable<TextCleaner> {

    private static final long serialVersionUID = 2658543236303100458L;

    private static final String sparkUidPrefix = "TextCleaner";

    private final String sparkUid;

    public TextCleaner() {
        sparkUid = org.apache.spark.ml.util.Identifiable$.MODULE$.randomUID(sparkUidPrefix);
    }

    public TextCleaner(String uid) {
        sparkUid = uid;
    }

    @Override
    public String uid() { // This method is called by the parent class before object creation finishes
        return sparkUid;
    }

    ...
{quote}



> Java incompatibility when extending UnaryTransformer or Transformer
> -------------------------------------------------------------------
>
>                 Key: SPARK-22198
>                 URL: https://issues.apache.org/jira/browse/SPARK-22198
>             Project: Spark
>          Issue Type: Bug
>          Components: Java API, ML
>    Affects Versions: 2.2.0
>            Reporter: Akos Tomasits
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
