mahout-dev mailing list archives

From "Pat Ferrel (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAHOUT-1786) Make classes implements Serializable for Spark 1.5+
Date Mon, 19 Dec 2016 16:36:58 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15761631#comment-15761631 ]

Pat Ferrel commented on MAHOUT-1786:
------------------------------------

It sounds like we could remove Kryo altogether and improve performance by using the new Spark
serializer. It also sounds like this takes the more standard approach of implementing
Serializable, which is already built into many Scala classes, IIRC.

Removing Kryo while gaining performance seems like a big win. Kryo causes many config problems
for new users.

> Make classes implements Serializable for Spark 1.5+
> ---------------------------------------------------
>
>                 Key: MAHOUT-1786
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1786
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Math
>    Affects Versions: 0.11.0
>            Reporter: Michel Lemay
>            Priority: Minor
>              Labels: performance
>
> Spark 1.5 comes with a new, very efficient serializer that uses code generation. It is
> twice as fast as Kryo. When using Mahout, we have to set KryoSerializer because some classes
> aren't serializable otherwise.
> I suggest declaring Math classes as "implements Serializable" where needed. For instance,
> to use the cooccurrence package in Spark 1.5, we had to modify AbstractMatrix, AbstractVector,
> DenseVector and SparseRowMatrix to make it work without Kryo.
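
For illustration only, here is a minimal sketch of what "implements Serializable" buys in
practice. SketchVector is a hypothetical stand-in, not Mahout's AbstractVector or DenseVector,
and roundTrip is a helper made up for this example. It only shows that a class marked
Serializable survives plain java.io serialization, which is the mechanism Spark's default
JavaSerializer relies on; it does not reproduce the actual Mahout change.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical example class; none of these names come from Mahout.
class SerializableSketch {

    // Stand-in for a math class. Marking it Serializable is the whole point:
    // without this marker, writeObject() throws NotSerializableException.
    static class SketchVector implements Serializable {
        private static final long serialVersionUID = 1L;
        private final double[] values;

        SketchVector(double[] values) {
            this.values = values.clone();
        }

        int size() {
            return values.length;
        }

        double get(int index) {
            return values[index];
        }
    }

    // Serializes an object to an in-memory byte array and reads it back,
    // using only standard java.io serialization (no Kryo, no Spark).
    static <T extends Serializable> T roundTrip(T obj)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    public static void main(String[] args) throws Exception {
        SketchVector original = new SketchVector(new double[] {1.0, 2.0, 3.0});
        SketchVector copy = roundTrip(original);
        System.out.println(copy.size() + " " + copy.get(1));
    }
}
```

If the "implements Serializable" clause is dropped from SketchVector, the roundTrip call fails
with java.io.NotSerializableException, which is the kind of failure that would otherwise push
users toward configuring a different serializer.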



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
