avro-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Updated: (AVRO-557) Speed up one-time data decoding
Date Tue, 10 Aug 2010 19:50:17 GMT

     [ https://issues.apache.org/jira/browse/AVRO-557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated AVRO-557:

    Attachment: AVRO-557.patch

The google-collections based version causes test-tools to hang running the tethering test,
a bit of multi-threaded Avro.  There could well be bugs in the tethering code that use of
google's concurrent hashmap somehow brings out, but I didn't take the time to debug that.

Google collections 1.0 is also incompatible with checkstyle, which needs google-collections
0.9.  So, until checkstyle 5.1 is available in a maven repo, if we use google collections,
then we need to use 0.9.

Here's a new version that doesn't rely on google collections for the above reasons.  The tethering
test passes and checkstyle works, but it's a bit slower:

GenericReaderOneTimeUsageTest: 2264 ms, 1.839572825273033 million entries/sec.  0.008609545204085958 million bytes/sec

I also tried using an equals-based hashmap, and things slowed to around 3800 milliseconds.  Also,
folks should not directly compare my timings with Kevin's: my laptop seems about 35% slower
than whatever Kevin uses.  If that's right, this latest version should be a bit faster than
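
For readers wondering why an equals-based map might cost more than an identity-based one, here is a minimal sketch using only the JDK.  It is not the code in the attached patch: SketchSchema, getOrBuild, hashCallsFor and sampleSchema are hypothetical names made up for illustration, and the sketch assumes the cache is keyed by a schema-like object whose hashCode() walks its nested fields.  It counts how many hashCode() calls a run of cache hits costs with each kind of map.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class ResolverCacheSketch {

    /** Hypothetical stand-in for a parsed schema tree; this is not Avro's Schema class. */
    static final class SketchSchema {
        static long hashCodeCalls = 0; // counts structural hashing work, for illustration only

        final String name;
        final List<SketchSchema> fields;

        SketchSchema(String name, List<SketchSchema> fields) {
            this.name = name;
            this.fields = fields;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof SketchSchema)) return false;
            SketchSchema other = (SketchSchema) o;
            return name.equals(other.name) && fields.equals(other.fields);
        }

        @Override
        public int hashCode() {
            hashCodeCalls++;
            return 31 * name.hashCode() + fields.hashCode(); // visits every nested field
        }
    }

    /** Returns the cached value for key, building it once on a miss. Not thread-safe. */
    static <V> V getOrBuild(Map<SketchSchema, V> cache, SketchSchema key,
                            Function<SketchSchema, V> builder) {
        V value = cache.get(key);
        if (value == null) {
            value = builder.apply(key);
            cache.put(key, value);
        }
        return value;
    }

    /** Performs the given number of cache hits and returns the hashCode() calls they cost. */
    static long hashCallsFor(Map<SketchSchema, String> cache, SketchSchema key, int lookups) {
        getOrBuild(cache, key, s -> "resolver for " + s.name); // warm the cache first
        long before = SketchSchema.hashCodeCalls;
        for (int i = 0; i < lookups; i++) {
            getOrBuild(cache, key, s -> "resolver for " + s.name);
        }
        return SketchSchema.hashCodeCalls - before;
    }

    /** Builds a small two-field record-like tree to use as a cache key. */
    static SketchSchema sampleSchema() {
        SketchSchema a = new SketchSchema("a", Collections.<SketchSchema>emptyList());
        SketchSchema b = new SketchSchema("b", Collections.<SketchSchema>emptyList());
        return new SketchSchema("record", Arrays.asList(a, b));
    }

    public static void main(String[] args) {
        SketchSchema schema = sampleSchema();
        long identity = hashCallsFor(new IdentityHashMap<SketchSchema, String>(), schema, 1000);
        long equalsBased = hashCallsFor(new HashMap<SketchSchema, String>(), schema, 1000);
        System.out.println("identity map hashCode() calls: " + identity);
        System.out.println("equals map hashCode() calls:   " + equalsBased);
    }
}
```

An IdentityHashMap hashes keys with System.identityHashCode and compares them with ==, so a hit never touches the key's own hashCode().  A HashMap has to hash the whole key on every lookup, which gets more expensive as the key gets deeper.  Neither map in the sketch is safe for concurrent use, so multi-threaded code would need its own synchronization.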

> Speed up one-time data decoding
> -------------------------------
>                 Key: AVRO-557
>                 URL: https://issues.apache.org/jira/browse/AVRO-557
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.3.2
>            Reporter: Kevin Oliver
>            Assignee: Doug Cutting
>             Fix For: 1.4.0
>         Attachments: AVRO-557.patch, AVRO-557.patch, AVRO-557.patch, AVRO-557.patch
> There are big gains to be had in performance when using a BinaryDecoder and a GenericDatumReader
> just one time. This is due to the relatively expensive parsing and initialization that came
> with 1.3. Patch with example code and a Perf harness to follow.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
