avro-dev mailing list archives

From "Kevin Oliver (JIRA)" <j...@apache.org>
Subject [jira] Commented: (AVRO-557) Speed up one-time data decoding
Date Wed, 02 Jun 2010 21:03:39 GMT

    [ https://issues.apache.org/jira/browse/AVRO-557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874797#action_12874797 ]

Kevin Oliver commented on AVRO-557:

Thanks Scott.

Unfortunately, I don't see any improvement with that. Here's a revised 1.3 test:

  private static class GenericReaderOneTimeUsage13Test extends GenericReaderOneTimeUsageTest {
    private final DecoderFactory factory;
    protected GenericReaderOneTimeUsage13Test() throws IOException {
      factory = new DecoderFactory().configureDecoderBufferSize(256);
    }
    @Override protected DatumReader<Object> getReader() {
      return new GenericDatumReader<Object>(writerSchema);
    }
    @Override protected Decoder getDecoder() {
      return factory.createBinaryDecoder(data, null);
    }
  }

GenericReaderOneTimeUsage12Test: 2108 ms, 1.9756520386699568 million entries/sec.  0.00924642139783131 million bytes/sec
GenericReaderOneTimeUsage13Test: 13197 ms, 0.31569739270247654 million entries/sec.  0.0014775228987635406 million bytes/sec
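
For scale, the elapsed times above put the 1.3 run at roughly 6.3 times the 1.2 run. A small sketch of that arithmetic follows; the class and method names are illustrative and are not part of the perf harness.

```java
import java.util.Locale;

final class SlowdownRatio {
    // Ratio of two elapsed times; values above 1 mean the candidate run was slower.
    static double ratio(long baselineMs, long candidateMs) {
        return (double) candidateMs / baselineMs;
    }

    public static void main(String[] args) {
        long v12Ms = 2108;   // GenericReaderOneTimeUsage12Test, from the results above
        long v13Ms = 13197;  // GenericReaderOneTimeUsage13Test, from the results above
        System.out.println(String.format(Locale.ROOT,
            "1.3 / 1.2 elapsed time: %.2fx", ratio(v12Ms, v13Ms)));
    }
}
```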

I also tried configuring the factory to use direct decoders with a small buffer, and the
results were about the same:
GenericReaderOneTimeUsage13Test: 12738 ms, 0.32707876157061233 million entries/sec.  0.0015307898357438956 million bytes/sec

As far as storing them in ThreadLocals, I have not tested that internally and I'm not in love
with that approach. What about adding another option onto DecoderFactory for one-time use?
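
For reference, a minimal sketch of the general ThreadLocal caching pattern mentioned above. It uses a hypothetical `ExpensiveReader` stand-in rather than Avro's own classes, and `ReaderCache` is likewise an invented name, so this only illustrates the idea of paying the construction cost once per thread; it is not the approach taken in the attached patch.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an object that is expensive to construct,
// for example a reader that must do parsing work before first use.
final class ExpensiveReader {
    // Counts constructor calls, for illustration only.
    static final AtomicInteger CONSTRUCTIONS = new AtomicInteger();

    ExpensiveReader() {
        CONSTRUCTIONS.incrementAndGet();
    }

    int decode(byte[] data) {
        return data.length; // placeholder for real decoding work
    }
}

final class ReaderCache {
    // One instance per thread, created lazily on that thread's first access.
    private static final ThreadLocal<ExpensiveReader> READER =
        ThreadLocal.withInitial(ExpensiveReader::new);

    static ExpensiveReader get() {
        return READER.get();
    }

    public static void main(String[] args) {
        // Many "one-time" decodes on the same thread reuse a single reader.
        for (int i = 0; i < 1000; i++) {
            get().decode(new byte[] {1, 2, 3});
        }
        System.out.println("constructions: " + ExpensiveReader.CONSTRUCTIONS.get());
    }
}
```

The trade-off is that each thread holds its cached instance for as long as the thread lives, which can matter in thread pools. That is one reason a factory-level option for one-time use may be preferable.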

> Speed up one-time data decoding
> -------------------------------
>                 Key: AVRO-557
>                 URL: https://issues.apache.org/jira/browse/AVRO-557
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.3.2
>            Reporter: Kevin Oliver
>            Assignee: Kevin Oliver
>             Fix For: 1.4.0
>         Attachments: AVRO-557.patch
> There are big gains to be had in performance when using a BinaryDecoder and a GenericDatumReader
> just one time. This is due to the relatively expensive parsing and initialization that came
> with 1.3. Patch with example code and a Perf harness to follow.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
