crunch-dev mailing list archives

From "Matthias Friedrich (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CRUNCH-173) Make WritableTypeFamily more compact for composite types
Date Sun, 10 Mar 2013 09:09:13 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598195#comment-13598195 ]

Matthias Friedrich commented on CRUNCH-173:
-------------------------------------------

Do people use Crunch's Writables for long-term storage? I've seen my share of problems caused
by serialization changes, but a 30% performance gain is pretty cool. How large is your test cluster?
In my experience, more compact serialization formats (through compression or otherwise)
don't help much on smaller clusters (10-20 machines) with a reasonably fast network in most
use cases.
                
> Make WritableTypeFamily more compact for composite types
> --------------------------------------------------------
>
>                 Key: CRUNCH-173
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-173
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>            Reporter: Josh Wills
>            Assignee: Josh Wills
>         Attachments: CRUNCH-173.patch
>
>
> I'm throwing this out as something of a strawman JIRA: it has always bugged me how verbose
> the serialization of TupleWritable et al. is compared to the Avro formats, so I took a crack
> at making their underlying serialization more compact by doing more things in terms
> of BytesWritable and using the wrapping MapFns to do more of the de-serialization
> work. The patch is attached; if anyone is interested in this or has an opinion on whether or not
> this is a good idea, I'd love to hear it. The big pro is that Crunch jobs that have to use
> Writables will run faster as a result. The downside is that the change is not backwards compatible
> and makes the code more complex.
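
To make the trade-off described above concrete, here is a minimal Java sketch (not the attached patch) that contrasts two byte layouts for a composite value. It assumes, purely for illustration, that the verbose layout tags every field with its class name, while the compact layout stores only length-prefixed bytes and leaves type knowledge to the reader, roughly the role the description assigns to the wrapping MapFns. TupleEncodingSketch and its methods are hypothetical names; the code uses only the JDK, not Hadoop or Crunch classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class TupleEncodingSketch {

    // Assumed "verbose" layout: field count, then for each field
    // a class name, a payload length, and the payload bytes.
    public static byte[] encodeVerbose(String[] classNames, byte[][] fields) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(fields.length);
            for (int i = 0; i < fields.length; i++) {
                out.writeUTF(classNames[i]);   // per-field type tag
                out.writeInt(fields[i].length);
                out.write(fields[i]);
            }
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Assumed "compact" layout: field count, then length-prefixed payloads.
    // The reader must already know each field's type.
    public static byte[] encodeCompact(byte[][] fields) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(fields.length);
            for (byte[] field : fields) {
                out.writeInt(field.length);
                out.write(field);
            }
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the compact layout back into raw per-field byte arrays.
    public static byte[][] decodeCompact(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            byte[][] fields = new byte[in.readInt()][];
            for (int i = 0; i < fields.length; i++) {
                fields[i] = new byte[in.readInt()];
                in.readFully(fields[i]);
            }
            return fields;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[][] fields = {
            "crunch".getBytes(StandardCharsets.UTF_8), // a string field
            {0, 0, 0, 42}                              // a 4-byte int field
        };
        String[] classNames = {
            "org.apache.hadoop.io.Text",
            "org.apache.hadoop.io.IntWritable"
        };
        System.out.println("verbose bytes: " + encodeVerbose(classNames, fields).length);
        System.out.println("compact bytes: " + encodeCompact(fields).length);
    }
}
```

The sketch also shows why such a change would not be backwards compatible: bytes written in one layout cannot be read by a decoder that expects the other.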

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
