spark-issues mailing list archives

From "Alok Bhandari (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-16473) BisectingKMeans Algorithm failing with java.util.NoSuchElementException: key not found
Date Thu, 27 Oct 2016 09:34:58 GMT

     [ https://issues.apache.org/jira/browse/SPARK-16473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alok Bhandari updated SPARK-16473:
----------------------------------
    Affects Version/s: 2.0.0

> BisectingKMeans Algorithm failing with java.util.NoSuchElementException: key not found
> --------------------------------------------------------------------------------------
>
>                 Key: SPARK-16473
>                 URL: https://issues.apache.org/jira/browse/SPARK-16473
>             Project: Spark
>          Issue Type: Bug
>          Components: ML, MLlib
>    Affects Versions: 1.6.1, 2.0.0
>         Environment: AWS EC2 linux instance. 
>            Reporter: Alok Bhandari
>
> Hello,
> I am using Apache Spark 1.6.1 and running the bisecting k-means algorithm on a specific dataset.
> Dataset details:
> k = 100,
> input vectors = 100K*100k,
> memory assigned = 16 GB per node,
> number of nodes = 2.
> Up to k=75 it works fine, but when I set k=100 it fails with java.util.NoSuchElementException: key not found.
> *I suspect it is failing because of a lack of some resource, but the exception does not convey why this Spark job failed.*
> Please can someone point me to the root cause of this exception and why it is failing?
> This is the exception stack trace:
> {code}
> java.util.NoSuchElementException: key not found: 166 
>         at scala.collection.MapLike$class.default(MapLike.scala:228) 
>         at scala.collection.AbstractMap.default(Map.scala:58) 
>         at scala.collection.MapLike$class.apply(MapLike.scala:141) 
>         at scala.collection.AbstractMap.apply(Map.scala:58) 
>         at org.apache.spark.mllib.clustering.BisectingKMeans$$anonfun$org$apache$spark$mllib$clustering$BisectingKMeans$$updateAssignments$1$$anonfun$2.apply$mcDJ$sp(BisectingKMeans.scala:338)
>         at org.apache.spark.mllib.clustering.BisectingKMeans$$anonfun$org$apache$spark$mllib$clustering$BisectingKMeans$$updateAssignments$1$$anonfun$2.apply(BisectingKMeans.scala:337)
>         at org.apache.spark.mllib.clustering.BisectingKMeans$$anonfun$org$apache$spark$mllib$clustering$BisectingKMeans$$updateAssignments$1$$anonfun$2.apply(BisectingKMeans.scala:337)
>         at scala.collection.TraversableOnce$$anonfun$minBy$1.apply(TraversableOnce.scala:231)
>         at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)
>         at scala.collection.immutable.List.foldLeft(List.scala:84)
>         at scala.collection.LinearSeqOptimized$class.reduceLeft(LinearSeqOptimized.scala:125)
>         at scala.collection.immutable.List.reduceLeft(List.scala:84)
>         at scala.collection.TraversableOnce$class.minBy(TraversableOnce.scala:231)
>         at scala.collection.AbstractTraversable.minBy(Traversable.scala:105)
>         at org.apache.spark.mllib.clustering.BisectingKMeans$$anonfun$org$apache$spark$mllib$clustering$BisectingKMeans$$updateAssignments$1.apply(BisectingKMeans.scala:337)
>         at org.apache.spark.mllib.clustering.BisectingKMeans$$anonfun$org$apache$spark$mllib$clustering$BisectingKMeans$$updateAssignments$1.apply(BisectingKMeans.scala:334)
>         at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>         at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:389)
> {code}
> The issue is that it fails without giving any explicit message about why it failed.
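For context, the `key not found` message in the trace comes from calling `apply` on a Scala `Map` whose key is absent: `MapLike.default` throws `java.util.NoSuchElementException` rather than returning a sentinel. A minimal standalone sketch of that mechanism (not Spark code; the names here are illustrative only):

```scala
// Minimal illustration (not Spark code; names are hypothetical).
// Scala's Map.apply delegates to MapLike.default for an absent key,
// which throws java.util.NoSuchElementException("key not found: <key>")
// -- the same failure mode surfaced in the stack trace above.
object KeyNotFoundDemo {
  def main(args: Array[String]): Unit = {
    val clusterCenters = Map(1L -> "center-1", 2L -> "center-2")

    val message =
      try {
        clusterCenters(166L) // absent key: throws, like index 166 in the trace
        "no exception"
      } catch {
        case e: NoSuchElementException => e.getMessage
      }

    println(message) // prints "key not found: 166"

    // A hedged lookup avoids the opaque failure entirely:
    println(clusterCenters.getOrElse(166L, "missing")) // prints "missing"
  }
}
```

As the bare message shows, the exception carries only the missing key, not which map was consulted or why the key was absent, which is why the job failure looks so uninformative.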



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

