mahout-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (MAHOUT-1655) Refactor module dependencies
Date Sun, 29 Mar 2015 19:30:53 GMT


ASF GitHub Bot commented on MAHOUT-1655:

GitHub user pferrel opened a pull request:


    Refactors mr-legacy into mahout-hdfs and mahout-mr
    Compiles and passes unit tests and can launch spark-shell, but spark-itemsimilarity fails with the following error:
    15/03/29 12:22:12 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@
    15/03/29 12:22:12 WARN BlockManager: Putting block broadcast_0 failed
    Exception in thread "main" java.lang.NoSuchMethodError:;
    	at org.apache.spark.util.collection.OpenHashSet$mcI$sp.getPos$mcI$sp(OpenHashSet.scala:165)
    	at org.apache.spark.util.collection.OpenHashSet$mcI$sp.contains$mcI$sp(OpenHashSet.scala:102)
    	at org.apache.spark.util.SizeEstimator$$anonfun$visitArray$2.apply$mcVI$sp(SizeEstimator.scala:214)
    	at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
    	at org.apache.spark.util.SizeEstimator$.visitArray(SizeEstimator.scala:210)
    	at org.apache.spark.util.SizeEstimator$.visitSingleObject(SizeEstimator.scala:169)
    	at org.apache.spark.util.SizeEstimator$.org$apache$spark$util$SizeEstimator$$estimate(SizeEstimator.scala:161)
    	at org.apache.spark.util.SizeEstimator$.estimate(SizeEstimator.scala:155)
    	at org.apache.spark.util.collection.SizeTracker$class.takeSample(SizeTracker.scala:78)
    	at org.apache.spark.util.collection.SizeTracker$class.afterUpdate(SizeTracker.scala:70)
    	at org.apache.spark.util.collection.SizeTrackingVector.$plus$eq(SizeTrackingVector.scala:31)
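
The message does not state the cause of this error, and the missing method's name is elided in the trace above. A NoSuchMethodError at runtime often means that a class was compiled against one version of a library but loaded from a different one, so a reasonable first step is to check which jar a class actually comes from. The following is a minimal sketch under that assumption, not a diagnosis of this particular failure; the class name `ClassOrigin` and the method `originOf` are hypothetical helpers introduced only for illustration.

```java
import java.net.URL;
import java.security.CodeSource;

public class ClassOrigin {

    /**
     * Returns the location (typically a jar or directory URL) that the named
     * class was loaded from, or a short note when that cannot be determined.
     */
    static String originOf(String className) {
        try {
            Class<?> cls = Class.forName(className);
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader usually have no CodeSource.
            if (src == null) {
                return "(bootstrap or unknown source)";
            }
            URL location = src.getLocation();
            return location == null ? "(bootstrap or unknown source)" : location.toString();
        } catch (ClassNotFoundException | LinkageError e) {
            // The class is missing, or it failed to link or initialize.
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Default to a class taken from the stack trace; pass other names as arguments.
        String[] names = args.length > 0
                ? args
                : new String[] {"org.apache.spark.util.collection.OpenHashSet"};
        for (String name : names) {
            System.out.println(name + " -> " + originOf(name));
        }
    }
}
```

Run with the same classpath as the failing job, this prints the jar each named class resolves to. If a class resolves to an unexpected jar, or to one bundled inside another module's assembly, a version conflict is a plausible explanation worth checking.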

You can merge this pull request into a Git repository by running:

    $ git pull MAHOUT-1655

Alternatively you can review and apply these changes as the patch at:

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #86

commit c783c0a91a5f6f1f279b77de6f52ccb39292b5e9
Author: Andrew Musselman <>
Date:   2015-03-26T00:38:56Z

    Moving mrlegacy directory to hdfs, starting minimal mr directory, repointing references to mrlegacy everywhere.

commit 1dc3662ac2f548d6ecc784aababda806aa5e7578
Author: Andrew Musselman <>
Date:   2015-03-26T14:24:44Z

    Merge branch 'master' of into MAHOUT-1655

commit 943d982f9ab08c1a9478ac40b02c4938091ec095
Author: Andrew Musselman <>
Date:   2015-03-26T17:00:12Z

    Merge branch 'master' into MAHOUT-1655

commit 1c65f2f441e4f504cdb6d215397097a3296eac15
Author: Andrew Musselman <>
Date:   2015-03-26T17:13:17Z

    Moving contents of hdfs over to mr.

commit 5c8e964991c88813a4cb8abe367245c3c7838246
Author: pferrel <>
Date:   2015-03-29T16:44:02Z

    Merge branch 'MAHOUT-1655' of into MAHOUT-1655

commit 2d940a04ad06cdeca23731355ef61f1d25d9d2d5
Author: pferrel <>
Date:   2015-03-29T18:12:21Z

    moved classes into mahout-hdfs and created a dependency in mahout-mr for the module

commit 7cda0918c5bef50f2e32b70c3439fc3656d804f8
Author: pferrel <>
Date:   2015-03-29T18:13:25Z

    added junits for moved classes

commit c2b18eebb9009bf0676566e8e9a30332441c2331
Author: pferrel <>
Date:   2015-03-29T19:20:11Z

    no need to pass mapreduce jar to Spark context now


> Refactor module dependencies
> ----------------------------
>                 Key: MAHOUT-1655
>                 URL:
>             Project: Mahout
>          Issue Type: Improvement
>          Components: mrlegacy
>    Affects Versions: 0.9
>            Reporter: Pat Ferrel
>            Assignee: Andrew Musselman
>            Priority: Critical
>             Fix For: 0.10.0
> Make a new module, call it mahout-hadoop. Move anything there that is currently in mrlegacy but used in math-scala or spark. Remove dependencies on mrlegacy altogether if possible by using other core classes.
> The goal is to have the math-scala and spark modules depend only on math and on a small module called mahout-hadoop (much smaller than mrlegacy).

This message was sent by Atlassian JIRA