avro-dev mailing list archives

From "Scott Carey (JIRA)" <j...@apache.org>
Subject [jira] Commented: (AVRO-557) Speed up one-time data decoding
Date Wed, 02 Jun 2010 20:45:40 GMT

    [ https://issues.apache.org/jira/browse/AVRO-557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874789#action_12874789 ]

Scott Carey commented on AVRO-557:
----------------------------------

bq. 2) BinaryDecoders now created a bunch of arrays and got more complicated, again significantly
slowing down one time usage.

Either configure the factory to use a small default buffer size (256 bytes will be fine),
or use the DirectBinaryDecoder, which is basically the 1.2 decoder.  See
DecoderFactory and its configureDecoderBufferSize(), configureDirectDecoder(), and createXXX
factory methods.
You need to configure your own factory instance, because the default factory does not allow
you to configure it.
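
To make the factory point concrete, here is a rough, self-contained sketch of the pattern.
It is not Avro code: SketchFactory and its bufferSize(), direct() and open() methods are
hypothetical names, the 8192 default is an assumption, and a java.io.BufferedInputStream
stands in for the buffering decoder.  It only shows a shared default instance that refuses
configuration next to private instances that accept it.

```java
import java.io.BufferedInputStream;
import java.io.InputStream;

/** Hypothetical factory sketch; the names are illustrative, not Avro's DecoderFactory API. */
class SketchFactory {
    // Shared default instance: rejects every attempt to reconfigure it.
    private static final SketchFactory DEFAULT = new SketchFactory() {
        @Override SketchFactory bufferSize(int bytes) {
            throw new UnsupportedOperationException("default factory is not configurable");
        }
        @Override SketchFactory direct(boolean on) {
            throw new UnsupportedOperationException("default factory is not configurable");
        }
    };

    private int bufferSize = 8192;   // assumed default, chosen for illustration
    private boolean direct = false;

    static SketchFactory defaultFactory() { return DEFAULT; }

    /** Sets the read-ahead buffer size used by buffered readers from this factory. */
    SketchFactory bufferSize(int bytes) {
        if (bytes <= 0) throw new IllegalArgumentException("buffer size must be positive");
        this.bufferSize = bytes;
        return this;
    }

    /** When on, readers pull straight from the stream and allocate no buffer. */
    SketchFactory direct(boolean on) {
        this.direct = on;
        return this;
    }

    /** Returns the source itself in direct mode, else a buffered wrapper. */
    InputStream open(InputStream source) {
        return direct ? source : new BufferedInputStream(source, bufferSize);
    }
}
```

With this shape, new SketchFactory().bufferSize(256) gives one-shot callers a cheap reader,
while callers of defaultFactory() keep the large shared default.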

The BinaryDecoder is optimized for the case where you are keeping one around and re-using
it.  If possible, keep them around.  You can use something like a ThreadLocal<BinaryDecoder>
to make this easier.  DirectBinaryDecoder will have smaller initialization costs, but will
be about 1.5x to 5x slower when streaming large amounts of data.
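
The ThreadLocal idea could look roughly like the sketch below.  CostlyDecoder is a
hypothetical stand-in for a decoder with an expensive constructor (it is not Avro's
BinaryDecoder), and DecoderCache.forInput() is an invented helper name.  Each thread
builds one decoder the first time it asks, and only resets it afterwards.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a decoder with costly setup; not Avro's BinaryDecoder. */
final class CostlyDecoder {
    static final AtomicInteger CONSTRUCTED = new AtomicInteger();

    private final byte[] buffer = new byte[8192]; // the one-time setup cost
    private int pos;
    private int limit;

    CostlyDecoder() { CONSTRUCTED.incrementAndGet(); }

    /** Points the decoder at new input, reusing the buffer allocated at construction. */
    CostlyDecoder reset(byte[] input) {
        if (input.length > buffer.length) throw new IllegalArgumentException("input exceeds buffer");
        System.arraycopy(input, 0, buffer, 0, input.length);
        pos = 0;
        limit = input.length;
        return this;
    }

    /** Returns the next byte as an unsigned value. */
    int readByte() {
        if (pos >= limit) throw new IllegalStateException("no more input");
        return buffer[pos++] & 0xff;
    }
}

/** Keeps one decoder per thread so repeated calls skip construction. */
final class DecoderCache {
    private static final ThreadLocal<CostlyDecoder> LOCAL =
            ThreadLocal.withInitial(CostlyDecoder::new);

    static CostlyDecoder forInput(byte[] input) {
        return LOCAL.get().reset(input);
    }
}
```

A decoder obtained this way must not be handed to another thread or held across a second
forInput() call on the same thread, since the reset overwrites its state.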

If you aren't reading from a stream but from a byte[], the new decoder will also work
without creating its own array if you use the createBinaryDecoder(byte[], ...) factory
methods.  You can reuse a decoder with these methods and reset its state as well.
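
As an illustration of decoding straight from a caller-supplied byte[] without copying, here
is a minimal sketch.  ArrayLongDecoder and its reset() signature are hypothetical and not
the Avro implementation; the only thing taken from Avro is the zigzag variable-length
encoding that the Avro specification defines for int and long values.

```java
/** Hypothetical decoder that reads zigzag varint longs directly from a caller's array. */
final class ArrayLongDecoder {
    private byte[] buf;   // borrowed from the caller, never copied
    private int pos;
    private int limit;

    /** Re-targets this decoder at a slice of the given array; allocates nothing. */
    ArrayLongDecoder reset(byte[] bytes, int offset, int length) {
        if (offset < 0 || length < 0 || offset + length > bytes.length) {
            throw new IllegalArgumentException("slice out of range");
        }
        this.buf = bytes;
        this.pos = offset;
        this.limit = offset + length;
        return this;
    }

    /** Reads one zigzag-encoded variable-length long. */
    long readLong() {
        long raw = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            if (pos >= limit) throw new IllegalStateException("truncated input");
            int b = buf[pos++] & 0xff;
            raw |= (long) (b & 0x7f) << shift;   // low 7 bits carry payload
            if ((b & 0x80) == 0) {               // high bit clear marks the last byte
                return (raw >>> 1) ^ -(raw & 1); // undo zigzag
            }
        }
        throw new IllegalStateException("varint longer than 10 bytes");
    }
}
```

Because reset() only stores a reference and two indexes, one instance can be pointed at a
new array for every message.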

> Speed up one-time data decoding
> -------------------------------
>
>                 Key: AVRO-557
>                 URL: https://issues.apache.org/jira/browse/AVRO-557
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.3.2
>            Reporter: Kevin Oliver
>            Assignee: Kevin Oliver
>             Fix For: 1.4.0
>
>         Attachments: AVRO-557.patch
>
>
> There are big gains to be had in performance when using a BinaryDecoder and a GenericDatumReader
> just one time. This is due to the relatively expensive parsing and initialization that came
> with 1.3. Patch with example code and a Perf harness to follow.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

