spark-issues mailing list archives

From "Patrick Wendell (JIRA)"
Subject [jira] [Commented] (SPARK-6066) Metadata in event log makes it very difficult for external libraries to parse event log
Date Sun, 01 Mar 2015 01:42:04 GMT


Patrick Wendell commented on SPARK-6066:

[~vanzin] - yes, you are right (I think an early scratch version of the feature used a Gzip
stream). There are Python bindings for all three of those compression codecs. To be fair,
I'm not 100% sure the codecs are standardized enough to be compatible across different implementations.
Gzip is pretty good in this regard, but I'm not sure about the other three.

> Metadata in event log makes it very difficult for external libraries to parse event log
> ---------------------------------------------------------------------------------------
>                 Key: SPARK-6066
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.3.0
>            Reporter: Kay Ousterhout
>            Assignee: Andrew Or
>            Priority: Blocker
> The fix for SPARK-2261 added a line at the beginning of the event log that encodes metadata.
> This line makes it much more difficult to parse the event logs from external libraries (like, which is used by folks at Berkeley) because:
> (1) The metadata is not written as JSON, unlike the rest of the file
> (2) More annoyingly, if the file is compressed, the metadata is not compressed.  This
> has a few side-effects: first, someone can't just use the command line to uncompress the file
> and then look at the logs, because the file is in this weird half-compressed format; and second,
> now external tools that parse these logs also need to deal with this weird format.
> We should fix this before the 1.3 release, because otherwise we'll have to add a bunch
> more backward-compatibility code to handle this weird format!
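
To make the "half-compressed" problem in the description concrete, here is a minimal
Python sketch. It is an illustration only: the metadata line "codec=gzip", the sample
event, and both helper functions are hypothetical, and gzip stands in for whichever
codec the log actually uses. The real header layout written by Spark may differ.

```python
import gzip
import json


def write_half_compressed(metadata_line, events):
    """Build a log in memory: one plain-text metadata line, then a
    gzip-compressed body of JSON lines (hypothetical layout)."""
    body = "\n".join(json.dumps(e) for e in events).encode("utf-8")
    return metadata_line.encode("utf-8") + b"\n" + gzip.compress(body)


def read_half_compressed(raw):
    """Split off the uncompressed first line, then decompress and
    parse the remainder as one JSON object per line."""
    header, _, compressed = raw.partition(b"\n")
    text = gzip.decompress(compressed).decode("utf-8")
    events = [json.loads(line) for line in text.splitlines() if line]
    return header.decode("utf-8"), events


# Illustrative data; the field names are assumptions, not taken from the issue.
raw = write_half_compressed(
    "codec=gzip", [{"Event": "SparkListenerJobStart", "Job ID": 0}]
)
header, events = read_half_compressed(raw)

# Decompressing the whole file directly fails, because the file does not
# start with a gzip header. This is the command-line problem in point (2).
try:
    gzip.decompress(raw)
    whole_file_ok = True
except OSError:
    whole_file_ok = False
```

Every external parser would need the equivalent of read_half_compressed, plus a
non-JSON parser for the header line, which is the extra burden the description
objects to.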

This message was sent by Atlassian JIRA
