crunch-dev mailing list archives

From "Gabriel Reid (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CRUNCH-173) Make WritableTypeFamily more compact for composite types
Date Sun, 10 Mar 2013 19:49:13 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598360#comment-13598360 ]

Gabriel Reid commented on CRUNCH-173:
-------------------------------------

I haven't had a chance to take a good look at the patch yet, but I'm definitely in favour
of the idea behind this one.

I feel that it's relatively acceptable to change the serialization format of composite types
in Crunch -- I think that composite types are more typically used in intermediate processing,
and not so much for long-term storage (at least that's my experience). The 30% speedup
provides a lot of extra motivation for it as well, although as Matthias mentioned it would
definitely be interesting to get more details on the situation in which you got the 30% speedup.
                
> Make WritableTypeFamily more compact for composite types
> --------------------------------------------------------
>
>                 Key: CRUNCH-173
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-173
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>            Reporter: Josh Wills
>            Assignee: Josh Wills
>         Attachments: CRUNCH-173.patch
>
>
> I'm throwing this out as something of a strawman JIRA: it's always bugged me how verbose
> the serialization of TupleWritable et al. are compared to the Avro formats, so I took a crack
> at changing their underlying serialization to be more compact by doing more things in terms
> of BytesWritable and using the wrapping MapFns in order to do more of the de-serialization
> work. Patch is attached, if anyone is interested in this or has an opinion on whether or not
> this is a good idea, I'd love to hear it. The big pro is that Crunch jobs that have to use
> writables will run faster as a result, the downside is that it's not backwards compatible
> and it makes the code more complex.
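
To illustrate the trade-off described in the issue, here is a rough sketch of why a
self-describing tuple encoding costs more bytes than a values-only encoding. It is not taken
from CRUNCH-173.patch and does not use the Hadoop or Crunch APIs: the class
CompactTupleSketch, its method names, and both byte layouts are assumptions made up for this
example, built on plain java.io streams. The "verbose" layout tags every field with its class
name, while the "compact" layout writes only the values and relies on the reader already
knowing the field types, much as a wrapping function that knows the tuple's schema could.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration only; not the Crunch implementation.
public class CompactTupleSketch {

    // Assumed "verbose" layout: each field is preceded by its class name,
    // so the record describes itself but repeats type names in every record.
    static byte[] writeVerbose(String name, int count) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeUTF(String.class.getName());
            out.writeUTF(name);
            out.writeUTF(Integer.class.getName());
            out.writeInt(count);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Assumed "compact" layout: values only, in a fixed order. The reader
    // must already know the field types, so old data cannot be read back
    // with this code (the backwards-compatibility cost).
    static byte[] writeCompact(String name, int count) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeUTF(name);
            out.writeInt(count);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decodes the compact layout using the schema known to the caller
    // (a String followed by an int).
    static Object[] readCompact(byte[] bytes) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
            String name = in.readUTF();
            int count = in.readInt();
            return new Object[] { name, count };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] verbose = writeVerbose("crunch", 173);
        byte[] compact = writeCompact("crunch", 173);
        System.out.println("verbose bytes: " + verbose.length);
        System.out.println("compact bytes: " + compact.length);
    }
}
```

The sketch also shows both downsides named in the issue: bytes written in the verbose layout
cannot be read by readCompact, and the decoding logic now depends on a schema held outside
the serialized record.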

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
