Date: Tue, 27 Oct 2015 09:07:27 +0000 (UTC)
From: "Apache Spark (JIRA)"
To: issues@spark.apache.org
Subject: [jira] [Commented] (SPARK-11343) Regression Imposes doubles on prediction/label columns

    [ https://issues.apache.org/jira/browse/SPARK-11343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976053#comment-14976053 ]

Apache Spark commented on SPARK-11343:
--------------------------------------

User 'dahlem' has created a pull request for this issue:
https://github.com/apache/spark/pull/9296

> Regression Imposes doubles on prediction/label columns
> ------------------------------------------------------
>
>                 Key: SPARK-11343
>                 URL: https://issues.apache.org/jira/browse/SPARK-11343
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 1.5.1
>         Environment: all environments
>            Reporter: Dominik Dahlem
>
> Using pyspark.ml and DataFrames, the ALS recommender cannot be evaluated with the RegressionEvaluator because of a type mismatch between the model transformation and the evaluation APIs: the prediction column arrives as FloatType, while the evaluator requires DoubleType. One can work around this by casting the prediction column to double before passing it to the evaluator. However, this workaround does not work with pipelines and cross-validation.
> Code and traceback below:
> {code}
> from pyspark.ml.evaluation import RegressionEvaluator
> from pyspark.ml.recommendation import ALS
>
> # `training` and `validation` are DataFrames with userID, movieID and rating columns.
> als = ALS(rank=10, maxIter=30, regParam=0.1, userCol='userID', itemCol='movieID', ratingCol='rating')
> model = als.fit(training)
> predictions = model.transform(validation)
> evaluator = RegressionEvaluator(predictionCol='prediction', labelCol='rating')
> # Raises IllegalArgumentException: the prediction column is FloatType, not DoubleType.
> validationRmse = evaluator.evaluate(predictions, {evaluator.metricName: 'rmse'})
> {code}
> Traceback:
>     validationRmse = evaluator.evaluate(predictions,
>     {evaluator.metricName: 'rmse'}
>     )
>   File "/Users/dominikdahlem/software/spark-1.6.0-SNAPSHOT-bin-custom-spark/python/lib/pyspark.zip/pyspark/ml/evaluation.py", line 63, in evaluate
>   File "/Users/dominikdahlem/software/spark-1.6.0-SNAPSHOT-bin-custom-spark/python/lib/pyspark.zip/pyspark/ml/evaluation.py", line 94, in _evaluate
>   File "/Users/dominikdahlem/software/spark-1.6.0-SNAPSHOT-bin-custom-spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
>   File "/Users/dominikdahlem/projects/repositories/spark/python/pyspark/sql/utils.py", line 42, in deco
>     raise IllegalArgumentException(s.split(': ', 1)[1])
> pyspark.sql.utils.IllegalArgumentException: requirement failed: Column prediction must be of type DoubleType but was actually FloatType.
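
The casting workaround mentioned in the report could look roughly like the sketch below. The helper name `with_double_column` is hypothetical and not part of Spark; it assumes `df` is a pyspark.sql.DataFrame and relies only on the standard `DataFrame.withColumn` and `Column.cast` calls.

```python
def with_double_column(df, column="prediction"):
    """Return a new DataFrame in which `column` is cast to double.

    Hypothetical helper for illustration: `df` is assumed to be a
    pyspark.sql.DataFrame. The input DataFrame is not modified.
    """
    # withColumn with an existing column name replaces that column
    # in the returned DataFrame; cast("double") yields DoubleType.
    return df.withColumn(column, df[column].cast("double"))
```

With the names from the report, this would be applied as `predictions = with_double_column(model.transform(validation))` before calling `evaluator.evaluate(predictions, ...)`. As the report notes, this does not help with pipelines or cross-validation, presumably because the evaluator is then handed the model output directly, with no place to insert the cast.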