spark-issues mailing list archives

From "Michael Armbrust (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-12141) Use Jackson to serialize all events when writing event log
Date Thu, 19 May 2016 19:17:13 GMT

    [ https://issues.apache.org/jira/browse/SPARK-12141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15291971#comment-15291971 ]

Michael Armbrust commented on SPARK-12141:
------------------------------------------

My issue with the catch-all case that was added is that it's not obvious to people creating new
messages that reflection is going to be used on them and that there are compatibility issues
in play.  As a result, the serializer was actually calling methods on our events that
caused side effects ({{source.getOffset}}), which was really surprising.  One way
to avoid surprises is to require manual serialization, but there are other things we can do.
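
To make the surprise concrete, here is a minimal sketch of how getter discovery by reflection can trigger side effects. It is a toy, not Jackson itself and not Spark's code: {{CountingSource}} and {{NaiveBeanSerializer}} are hypothetical names, and the only assumption is that the serializer invokes every public zero-argument {{get*}} method it finds.

```scala
// Hypothetical stand-in for an internal object; NOT Spark's actual source class.
class CountingSource {
  private var reads = 0

  // Looks like a plain getter, but mutates state on every call.
  def getOffset: Long = { reads += 1; reads.toLong }

  // Not named get*, so the toy serializer below never calls it.
  def readCount: Int = reads
}

// Toy reflective serializer (an assumption for illustration, not Jackson's implementation).
object NaiveBeanSerializer {
  // Invokes every public zero-argument method named get* (except getClass)
  // and collects the results keyed by the name without the "get" prefix.
  def toFields(obj: AnyRef): Map[String, Any] =
    obj.getClass.getMethods.toSeq
      .filter { m =>
        m.getName.startsWith("get") &&
        m.getName != "getClass" &&
        m.getParameterCount == 0
      }
      .map(m => m.getName.stripPrefix("get") -> m.invoke(obj))
      .toMap
}
```

Calling {{NaiveBeanSerializer.toFields(new CountingSource)}} invokes {{getOffset}} as part of "just serializing" the object, so the object's internal state changes even though the author never asked for that method to be called.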

I'm not strongly against automatic serialization, but I think we need some guidelines for
its use. Straw man:
 - use separate case classes instead of internal objects
 - a limited set of supported types (I've seen Jackson do weird things with collections / options)

Perhaps there needs to be a trait or something that you mix in that states, "I expect this
to be JSON serialized and I understand the compatibility rules".
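
A rough sketch of what such an opt-in marker could look like, combined with the "separate case classes" guideline from the straw man. All names here ({{JsonSerializableEvent}}, {{QueryProgress}}, {{EventLogGuard}}) are hypothetical and do not exist in Spark; this only illustrates the shape of the idea.

```scala
// Hypothetical marker trait: mixing it in states "I expect this to be JSON
// serialized and I understand the compatibility rules". Not a Spark API.
trait JsonSerializableEvent

// A dedicated case class with only simple field types, rather than an
// internal object with behavior attached.
final case class QueryProgress(queryId: String, offset: Long)
  extends JsonSerializableEvent

object EventLogGuard {
  // Runtime check: only events that opted in are eligible for automatic
  // serialization; anything else is rejected with an explanatory message.
  def requireOptIn(event: Any): Either[String, JsonSerializableEvent] =
    event match {
      case e: JsonSerializableEvent => Right(e)
      case other =>
        Left(s"${other.getClass.getName} has not opted in to automatic JSON serialization")
    }

  // Compile-time variant: the type bound makes passing a non-opted-in
  // type a compilation error instead of a runtime surprise.
  def accept[E <: JsonSerializableEvent](event: E): E = event
}
```

With this in place, {{EventLogGuard.requireOptIn(QueryProgress("q1", 42L))}} yields a {{Right}}, while passing an arbitrary object yields a {{Left}} that names the offending class. The catch-all case would then apply only to types whose authors have explicitly acknowledged the rules.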

> Use Jackson to serialize all events when writing event log
> ----------------------------------------------------------
>
>                 Key: SPARK-12141
>                 URL: https://issues.apache.org/jira/browse/SPARK-12141
>             Project: Spark
>          Issue Type: Task
>          Components: Spark Core
>            Reporter: Marcelo Vanzin
>
> SPARK-11206 added infrastructure to serialize events using Jackson, so that manual serialization code is not needed anymore.
> We should write all events using that support, and remove all the manual serialization code in {{JsonProtocol}}.
> Since the event log format is a semi-public API, I'm targeting this at 2.0. Also, we can't remove the manual deserialization code, since we need to be able to read old event logs.


