avro-dev mailing list archives

From "Scott Carey (JIRA)" <j...@apache.org>
Subject [jira] Commented: (AVRO-753) Java: Improve BinaryEncoder Performance
Date Mon, 07 Feb 2011 18:22:57 GMT

    [ https://issues.apache.org/jira/browse/AVRO-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991501#comment-12991501
] 

Scott Carey commented on AVRO-753:
----------------------------------

I think that 1.4.1 is significantly faster for Specific/Generic decoding than 1.3.2 due to
AVRO-557.

What about the buffering issue?

Does it make sense to follow our Decoder pattern and have BinaryEncoder and DirectBinaryEncoder?
Or follow the java.io naming convention and have BinaryEncoder (unbuffered, like OutputStream)
and BufferedBinaryEncoder (buffered, like BufferedOutputStream)?

The former matches our decoder convention, but the latter will not introduce bugs to users
who don't call flush() properly now.
I'm leaning towards the latter, with careful javadoc and release notes. 
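
To make the flush() concern concrete, here is a minimal sketch using only plain java.io
classes, not Avro's encoders. The class name FlushPitfall and the method visibleBytes are
hypothetical, chosen for this illustration. It shows that bytes written through a buffered
wrapper do not reach the underlying stream until flush() is called, which is the failure
mode existing callers would hit if BinaryEncoder started buffering under its current name.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class FlushPitfall {

    /**
     * Writes n single bytes through a BufferedOutputStream and returns how many
     * bytes are visible in the underlying sink afterwards.
     */
    static int visibleBytes(int n, boolean flush) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // 8192-byte buffer: far larger than the data written below.
        OutputStream out = new BufferedOutputStream(sink, 8192);
        try {
            for (int i = 0; i < n; i++) {
                out.write(i);
            }
            if (flush) {
                out.flush();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.size();
    }

    public static void main(String[] args) {
        // Caller forgot flush(): nothing has reached the sink yet.
        System.out.println("without flush: " + visibleBytes(100, false)); // 0
        // Caller flushed: all bytes are visible.
        System.out.println("with flush:    " + visibleBytes(100, true));  // 100
    }
}
```

Code written against an unbuffered stream and never flushed behaves like the first call once
buffering is introduced, which is why keeping the unbuffered behavior under the existing name
looks like the safer option.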

> Java:  Improve BinaryEncoder Performance
> ----------------------------------------
>
>                 Key: AVRO-753
>                 URL: https://issues.apache.org/jira/browse/AVRO-753
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>            Reporter: Scott Carey
>            Assignee: Scott Carey
>             Fix For: 1.5.0
>
>         Attachments: AVRO-753.v1.patch
>
>
> BinaryEncoder has not had a performance improvement pass like BinaryDecoder did. It
> still mostly writes directly to the underlying OutputStream, which is not optimal for
> performance. I like to use a rule that if you are writing to an OutputStream or reading
> from an InputStream in chunks smaller than 128 bytes, you have a performance problem.
> Measurements indicate that optimizing BinaryEncoder yields a 2.5x to 6x performance
> improvement. The process is significantly simpler than BinaryDecoder because 'pushing'
> is easier than 'pulling' -- and also because we do not need a 'direct' variant because
> BinaryEncoder already buffers sometimes.
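
The small-chunk rule above can be illustrated with a rough sketch that counts how many write
calls reach the underlying stream. This uses plain java.io classes rather than Avro's
encoders, and SmallWriteDemo, CountingSink and underlyingWrites are hypothetical names
invented for the example. It only counts calls; it does not reproduce the 2.5x to 6x
measurements quoted in the issue.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class SmallWriteDemo {

    /** Sink that discards data but counts how many write calls reach it. */
    static class CountingSink extends OutputStream {
        long calls = 0;
        long bytes = 0;

        @Override
        public void write(int b) {
            calls++;
            bytes++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
            bytes += len;
        }
    }

    /** Performs `writes` 4-byte writes and returns the number of calls seen by the sink. */
    static long underlyingWrites(int writes, boolean buffered) {
        CountingSink sink = new CountingSink();
        OutputStream out = buffered ? new BufferedOutputStream(sink, 4096) : sink;
        byte[] chunk = new byte[4]; // far below the 128-byte rule of thumb
        try {
            for (int i = 0; i < writes; i++) {
                out.write(chunk);
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (sink.bytes != 4L * writes) {
            throw new IllegalStateException("lost data");
        }
        return sink.calls;
    }

    public static void main(String[] args) {
        System.out.println("direct calls:   " + underlyingWrites(10_000, false));
        System.out.println("buffered calls: " + underlyingWrites(10_000, true));
    }
}
```

With buffering, each call that reaches the sink carries up to 4096 bytes instead of 4, so the
per-call overhead of the underlying stream is paid far less often.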

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
