spark-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michal Laclavik (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-10115) MLlib ALS training fails with java.lang.ClassCastException
Date Fri, 21 Aug 2015 08:10:45 GMT

    [ https://issues.apache.org/jira/browse/SPARK-10115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706381#comment-14706381
] 

Michal Laclavik commented on SPARK-10115:
-----------------------------------------

Yes, that is right, It was not obvious for me from the errors, but when I examined worker
logs I found also 
{code}
java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer
{code}
Sorry for bothering you with this, but I did not realized it from those error messages at
first.


> MLlib ALS training fails with java.lang.ClassCastException
> ----------------------------------------------------------
>
>                 Key: SPARK-10115
>                 URL: https://issues.apache.org/jira/browse/SPARK-10115
>             Project: Spark
>          Issue Type: Bug
>         Environment: first experienced on spark 1.2.1 but then also with latest
> spark-1.4.1-bin-hadoop2.6
>            Reporter: Michal Laclavik
>
> I am running ALS collaborative filtering training on data which looks as follows (sample
by running "user_product.take(10)"):
> {code}
> [(1205640308657491975, 50233468418, 1.0),
>  (4743366459073625989, 50233472294, 1.0),
>  (4743366459073625989, 50233473253, 1.0),
>  (4743366459073625989, 75586230246, 1.0),
>  (4743366459073625989, 50233473248, 1.0),
>  (56766162624422850, 74848929776, 1.0),
>  (56766162624422850, 50233473397, 1.0),
>  (56766162624422850, 78185852309, 1.0),
>  (56766162624422850, 73533710263, 1.0),
>  (56766162624422850, 78185852319, 1.0)]
> {code}
> and then I call training on that RDD:
> {code}
> rank = 12
> iterations=5
> model = ALS.train(user_product, rank, iterations)
> {code}
> and I get following error:
> {code}
> ---------------------------------------------------------------------------
> Py4JJavaError                             Traceback (most recent call last)
> <ipython-input-54-4e711b94952d> in <module>()
>       2 rank = 12
>       3 iterations=5
> ----> 4 model = ALS.train(user_product, rank, iterations)
> /opt/spark/python/pyspark/mllib/recommendation.py in train(cls, ratings, rank, iterations,
lambda_, blocks, nonnegative, seed)
>     192               seed=None):
>     193         model = callMLlibFunc("trainALSModel", cls._prepare(ratings), rank, iterations,
> --> 194                               lambda_, blocks, nonnegative, seed)
>     195         return MatrixFactorizationModel(model)
>     196 
> /opt/spark/python/pyspark/mllib/common.py in callMLlibFunc(name, *args)
>     126     sc = SparkContext._active_spark_context
>     127     api = getattr(sc._jvm.PythonMLLibAPI(), name)
> --> 128     return callJavaFunc(sc, api, *args)
>     129 
>     130 
> /opt/spark/python/pyspark/mllib/common.py in callJavaFunc(sc, func, *args)
>     119     """ Call Java Function """
>     120     args = [_py2java(sc, a) for a in args]
> --> 121     return _java2py(sc, func(*args))
>     122 
>     123 
> /opt/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py in __call__(self, *args)
>     536         answer = self.gateway_client.send_command(command)
>     537         return_value = get_return_value(answer, self.gateway_client,
> --> 538                 self.target_id, self.name)
>     539 
>     540         for temp_arg in temp_args:
> /opt/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py in get_return_value(answer,
gateway_client, target_id, name)
>     298                 raise Py4JJavaError(
>     299                     'An error occurred while calling {0}{1}{2}.\n'.
> --> 300                     format(target_id, '.', name), value)
>     301             else:
>     302                 raise Py4JError(
> Py4JJavaError: An error occurred while calling o448.trainALSModel.
> : org.apache.spark.SparkException: Job aborted due to stage failure: Task 9 in stage
57.0 failed 1 times, most recent failure: Lost task 9.0 in stage 57.0 (TID 4187, localhost):
java.lang.ClassCastException
> Driver stacktrace:
> 	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1273)
> 	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1264)
> 	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1263)
> 	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> 	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> 	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1263)
> 	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
> 	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
> 	at scala.Option.foreach(Option.scala:236)
> 	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1457)
> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1418)
> 	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org


Mime
View raw message