crunch-dev mailing list archives

From "Josh Wills (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CRUNCH-173) Make WritableTypeFamily more compact for composite types
Date Mon, 04 Mar 2013 06:15:12 GMT

     [ https://issues.apache.org/jira/browse/CRUNCH-173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Wills updated CRUNCH-173:
------------------------------

    Attachment: CRUNCH-173.patch

Here's what it looks like. It's not the prettiest thing ever, but it is a good deal faster
(30% or so) on some of my test sets on the cluster, where I'm essentially trading extra CPU
for less IO. The difference on the unit/integration tests is pretty marginal since we don't
write that much data out.
                
> Make WritableTypeFamily more compact for composite types
> --------------------------------------------------------
>
>                 Key: CRUNCH-173
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-173
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>            Reporter: Josh Wills
>            Assignee: Josh Wills
>         Attachments: CRUNCH-173.patch
>
>
> I'm throwing this out as something of a strawman JIRA: it's always bugged me how verbose
> the serialization of TupleWritable et al. is compared to the Avro formats, so I took a crack
> at making the underlying serialization more compact by doing more things in terms of
> BytesWritable and using the wrapping MapFns to do more of the de-serialization work. The
> patch is attached; if anyone is interested in this or has an opinion on whether or not
> this is a good idea, I'd love to hear it. The big pro is that Crunch jobs that have to use
> Writables will run faster as a result. The downsides are that it's not backwards compatible
> and that it makes the code more complex.
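
To make the size trade-off concrete, here is a rough sketch of the general idea, not the code in CRUNCH-173.patch. It assumes the verbose layout tags every field with its class name, which is an assumption made for illustration and not something stated in this message. The compact layout writes only length-prefixed bytes and leaves the reader to know the schema, roughly the role the wrapping MapFns play. It uses plain java.io instead of the Hadoop Writable classes, and the class and method names (CompactTupleSketch, verbose, compact) are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class CompactTupleSketch {

    // Assumed verbose layout: field count, then for each field its
    // class name, its payload length, and the payload bytes.
    public static byte[] verbose(String[] typeNames, byte[][] fields) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(fields.length);
            for (int i = 0; i < fields.length; i++) {
                out.writeUTF(typeNames[i]);      // per-field type tag
                out.writeInt(fields[i].length);
                out.write(fields[i]);
            }
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Assumed compact layout: field count, then length-prefixed payloads only.
    // The reader must already know each field's type to decode the bytes.
    public static byte[] compact(byte[][] fields) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(fields.length);
            for (byte[] field : fields) {
                out.writeInt(field.length);
                out.write(field);
            }
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] types = {
            "org.apache.hadoop.io.Text",
            "org.apache.hadoop.io.IntWritable"
        };
        byte[][] fields = {
            "crunch".getBytes(StandardCharsets.UTF_8),
            {0, 0, 0, 42}
        };
        System.out.println("verbose bytes: " + verbose(types, fields).length);
        System.out.println("compact bytes: " + compact(fields).length);
    }
}
```

In this sketch the type tags dominate the verbose record for small fields, which is why dropping them reduces IO. The price is extra decoding work on the reader's side and a format that older readers cannot parse.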

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
