mahout-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAHOUT-1655) Refactor module dependencies
Date Sun, 29 Mar 2015 19:30:53 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385902#comment-14385902 ]

ASF GitHub Bot commented on MAHOUT-1655:
----------------------------------------

GitHub user pferrel opened a pull request:

    https://github.com/apache/mahout/pull/86

    MAHOUT-1655

    Refactors mr-legacy into mahout-hdfs and mahout-mr
    
    Compiles and passes unit tests and can launch spark-shell, but spark-itemsimilarity fails with the following error:
    
    15/03/29 12:22:12 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@192.168.0.7:52857/user/HeartbeatReceiver
    15/03/29 12:22:12 WARN BlockManager: Putting block broadcast_0 failed
    Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;
    	at org.apache.spark.util.collection.OpenHashSet.org$apache$spark$util$collection$OpenHashSet$$hashcode(OpenHashSet.scala:261)
    	at org.apache.spark.util.collection.OpenHashSet$mcI$sp.getPos$mcI$sp(OpenHashSet.scala:165)
    	at org.apache.spark.util.collection.OpenHashSet$mcI$sp.contains$mcI$sp(OpenHashSet.scala:102)
    	at org.apache.spark.util.SizeEstimator$$anonfun$visitArray$2.apply$mcVI$sp(SizeEstimator.scala:214)
    	at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
    	at org.apache.spark.util.SizeEstimator$.visitArray(SizeEstimator.scala:210)
    	at org.apache.spark.util.SizeEstimator$.visitSingleObject(SizeEstimator.scala:169)
    	at org.apache.spark.util.SizeEstimator$.org$apache$spark$util$SizeEstimator$$estimate(SizeEstimator.scala:161)
    	at org.apache.spark.util.SizeEstimator$.estimate(SizeEstimator.scala:155)
    	at org.apache.spark.util.collection.SizeTracker$class.takeSample(SizeTracker.scala:78)
    	at org.apache.spark.util.collection.SizeTracker$class.afterUpdate(SizeTracker.scala:70)
    	at org.apache.spark.util.collection.SizeTrackingVector.$plus$eq(SizeTrackingVector.scala:31)
    	at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:236)
    	at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:126)
    	at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:104)
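
    A NoSuchMethodError like the one above is raised when a class found at run time lacks a method that the calling code was compiled against. One common cause is a different version of the library on the classpath, but that is an assumption here and is not confirmed anywhere in this message. As a minimal diagnostic sketch, the hypothetical helper below (ClasspathProbe is not part of Mahout or Spark) uses only standard Java reflection to report where a class was loaded from and whether it exposes a given public method:

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of Mahout, Spark, or Guava.
public class ClasspathProbe {

    /** Returns the location a class was loaded from, or a short note if it cannot be determined. */
    public static String locationOf(String className) {
        try {
            Class<?> c = Class.forName(className);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader have no code source.
            return src == null ? "bootstrap or unknown source" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "class not found";
        }
    }

    /** Returns true if the named class is loadable and has a public method with this signature. */
    public static boolean hasMethod(String className, String methodName, Class<?>... params) {
        try {
            Class.forName(className).getMethod(methodName, params);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class and method names are taken from the stack trace above.
        String cls = "com.google.common.hash.HashFunction";
        System.out.println(cls + " loaded from: " + locationOf(cls));
        System.out.println("hashInt(int) present: " + hasMethod(cls, "hashInt", int.class));
    }
}
```

    Running this with the same classpath as the failing job would show which jar supplies HashFunction, which may help decide whether a dependency needs to be excluded or aligned.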


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/pferrel/mahout MAHOUT-1655

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/mahout/pull/86.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #86
    
----
commit c783c0a91a5f6f1f279b77de6f52ccb39292b5e9
Author: Andrew Musselman <akm@apache.org>
Date:   2015-03-26T00:38:56Z

    Moving mrlegacy directory to hdfs, starting minimal mr directory, repointing references to mrlegacy everywhere.

commit 1dc3662ac2f548d6ecc784aababda806aa5e7578
Author: Andrew Musselman <akm@apache.org>
Date:   2015-03-26T14:24:44Z

    Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/mahout into MAHOUT-1655

commit 943d982f9ab08c1a9478ac40b02c4938091ec095
Author: Andrew Musselman <akm@apache.org>
Date:   2015-03-26T17:00:12Z

    Merge branch 'master' into MAHOUT-1655

commit 1c65f2f441e4f504cdb6d215397097a3296eac15
Author: Andrew Musselman <akm@apache.org>
Date:   2015-03-26T17:13:17Z

    Moving contents of hdfs over to mr.

commit 5c8e964991c88813a4cb8abe367245c3c7838246
Author: pferrel <pat@occamsmachete.com>
Date:   2015-03-29T16:44:02Z

    Merge branch 'MAHOUT-1655' of https://github.com/andrewmusselman/mahout into MAHOUT-1655

commit 2d940a04ad06cdeca23731355ef61f1d25d9d2d5
Author: pferrel <pat@occamsmachete.com>
Date:   2015-03-29T18:12:21Z

    moved classes into mahout-hdfs and created a dependency in mahout-mr for the module

commit 7cda0918c5bef50f2e32b70c3439fc3656d804f8
Author: pferrel <pat@occamsmachete.com>
Date:   2015-03-29T18:13:25Z

    added junits for moved classes

commit c2b18eebb9009bf0676566e8e9a30332441c2331
Author: pferrel <pat@occamsmachete.com>
Date:   2015-03-29T19:20:11Z

    no need to pass mapreduce jar to Spark context now

----


> Refactor module dependencies
> ----------------------------
>
>                 Key: MAHOUT-1655
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1655
>             Project: Mahout
>          Issue Type: Improvement
>          Components: mrlegacy
>    Affects Versions: 0.9
>            Reporter: Pat Ferrel
>            Assignee: Andrew Musselman
>            Priority: Critical
>             Fix For: 0.10.0
>
>
> Make a new module, call it mahout-hadoop. Move anything there that is currently in mrlegacy but used in math-scala or spark. Remove dependencies on mrlegacy altogether if possible by using other core classes.
> The goal is to have the math-scala and spark modules depend on math and on a small module called mahout-hadoop (much smaller than mrlegacy).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
