spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-11343) Regression Imposes doubles on prediction/label columns
Date Tue, 27 Oct 2015 09:07:27 GMT

    [ https://issues.apache.org/jira/browse/SPARK-11343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976053#comment-14976053 ]

Apache Spark commented on SPARK-11343:
--------------------------------------

User 'dahlem' has created a pull request for this issue:
https://github.com/apache/spark/pull/9296

> Regression Imposes doubles on prediction/label columns
> ------------------------------------------------------
>
>                 Key: SPARK-11343
>                 URL: https://issues.apache.org/jira/browse/SPARK-11343
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 1.5.1
>         Environment: all environments
>            Reporter: Dominik Dahlem
>
> Using pyspark.ml and DataFrames, the ALS recommender cannot be evaluated with the RegressionEvaluator
> because of a type mismatch between the model transformation and the evaluation APIs: the prediction
> column is of type FloatType, whereas the evaluator requires DoubleType. One can work around this by
> casting the prediction column to double before passing it to the evaluator. However, this workaround
> does not work with pipelines and cross-validation.
> Code and traceback below:
> {code}
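> from pyspark.ml.evaluation import RegressionEvaluator
> from pyspark.ml.recommendation import ALS
> als = ALS(rank=10, maxIter=30, regParam=0.1, userCol='userID', itemCol='movieID', ratingCol='rating')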
> model = als.fit(training)
> predictions = model.transform(validation)
> evaluator = RegressionEvaluator(predictionCol='prediction', labelCol='rating')
> validationRmse = evaluator.evaluate(predictions, {evaluator.metricName: 'rmse'})
> {code}
> Traceback:
> validationRmse = evaluator.evaluate(predictions, {evaluator.metricName: 'rmse'})
> File "/Users/dominikdahlem/software/spark-1.6.0-SNAPSHOT-bin-custom-spark/python/lib/pyspark.zip/pyspark/ml/evaluation.py", line 63, in evaluate
> File "/Users/dominikdahlem/software/spark-1.6.0-SNAPSHOT-bin-custom-spark/python/lib/pyspark.zip/pyspark/ml/evaluation.py", line 94, in _evaluate
> File "/Users/dominikdahlem/software/spark-1.6.0-SNAPSHOT-bin-custom-spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in _call_
> File "/Users/dominikdahlem/projects/repositories/spark/python/pyspark/sql/utils.py", line 42, in deco
> raise IllegalArgumentException(s.split(': ', 1)[1])
> pyspark.sql.utils.IllegalArgumentException: requirement failed: Column prediction must be of type DoubleType but was actually FloatType.
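
The casting workaround mentioned in the description could look roughly like the sketch below. The helper name with_double_column is hypothetical and not part of pyspark; it assumes its argument is a pyspark.sql.DataFrame and relies only on DataFrame.withColumn, column indexing, and Column.cast with the type name "double". It also assumes that withColumn replaces an existing column of the same name, which may not hold on very old Spark releases.

```python
def with_double_column(df, column="prediction"):
    """Return a new DataFrame in which `column` is cast to double.

    Hypothetical helper for illustration; `df` is assumed to be a
    pyspark.sql.DataFrame. The input DataFrame is left unchanged.
    """
    # Column.cast accepts a type name string such as "double";
    # withColumn with an existing name is assumed to replace that column.
    return df.withColumn(column, df[column].cast("double"))


# Intended use with the names from the report above (not executed here,
# since `model`, `validation` and `evaluator` come from the reporter's code):
#   predictions = with_double_column(model.transform(validation))
#   validationRmse = evaluator.evaluate(predictions, {evaluator.metricName: 'rmse'})
```

As the report notes, a manual cast like this only helps when the evaluator is called directly; inside a pipeline or cross-validation run the evaluator receives the model output without an opportunity for this extra step.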


