avro-dev mailing list archives

From "Scott Carey (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-911) remove object reuse from Java APIs
Date Thu, 06 Oct 2011 02:06:29 GMT

    [ https://issues.apache.org/jira/browse/AVRO-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121674#comment-13121674 ]

Scott Carey commented on AVRO-911:

Object reuse is also hard to apply in some of the work I am doing in AVRO-859 (or rather,
have not had time to do in months).  Applying object reuse to complicated object graphs is
not very beneficial.  Additionally, making such object graphs as immutable as possible has
performance gains of its own and simplifies the code.

In simple cases, reuse can have big gains.  These mostly boil down to avoiding boxing of
small primitives: there, you go from allocating something to allocating nothing.
For Utf8, we have to copy a byte[] out of the stream anyway, so the Utf8 object itself is
only a small portion of the total allocation, unless the string is empty and we were to
reuse an empty byte[].
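
To make the boxing point concrete, here is a minimal sketch. MutableInt and ReuseSketch
are hypothetical names used only for illustration; neither is an Avro class, and the int[]
stands in for a decoder. The boxed path may allocate an Integer per value, while the reuse
path writes into a holder supplied by the caller and allocates nothing per value.

```java
// Hypothetical mutable holder for illustration; not part of Avro.
final class MutableInt {
    int value;
}

final class ReuseSketch {
    // Boxed path: autoboxing may allocate an Integer for each value
    // (small values can come from the Integer cache instead).
    static Integer readBoxed(int[] source, int i) {
        return source[i];
    }

    // Reuse path: fills the caller's holder, allocating only when none is given.
    static MutableInt readInto(int[] source, int i, MutableInt reuse) {
        MutableInt target = (reuse != null) ? reuse : new MutableInt();
        target.value = source[i];
        return target;
    }

    static long sumBoxed(int[] source) {
        long sum = 0;
        for (int i = 0; i < source.length; i++) {
            Integer boxed = readBoxed(source, i);
            sum += boxed; // unboxes again
        }
        return sum;
    }

    static long sumWithReuse(int[] source) {
        MutableInt holder = new MutableInt(); // one allocation for the whole loop
        long sum = 0;
        for (int i = 0; i < source.length; i++) {
            sum += readInto(source, i, holder).value;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1000, 2000, 3000};
        System.out.println(sumBoxed(data));
        System.out.println(sumWithReuse(data));
    }
}
```

Both loops compute the same sum; they differ only in how many short-lived objects they may
create along the way.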

Delaying or avoiding Utf8 <-> String conversion is very beneficial, however.  I use Utf8
in many places now for this purpose.

I support Avro removing object reuse for the general case.  Specializations for mutable boxed
primitives, or even simply returning / accepting primitives, are something we can add later.
The low-level read and write should have options for dealing with String as well as Utf8.
Higher-level APIs can choose either (for example, one might have two different SpecificCompiler
templates, or switch the type based on an annotation in AvroIDL).
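
As a rough illustration of what delaying the conversion buys, here is a sketch using a
hypothetical LazyText class. It is not Avro's Utf8 and makes no claim about how Utf8 is
implemented. It keeps the raw UTF-8 bytes, compares on bytes alone, and decodes to a
String only when one is actually requested. The decodeCount field exists purely for
demonstration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for a byte-backed string type; not Avro's Utf8.
final class LazyText {
    private final byte[] bytes;
    private String decoded; // filled in on first toString()
    int decodeCount;        // demonstration only: how many times we decoded

    LazyText(byte[] bytes) {
        this.bytes = bytes;
    }

    // Equality check on the raw bytes; no String is created.
    boolean sameBytes(LazyText other) {
        return Arrays.equals(bytes, other.bytes);
    }

    // Decodes once, then returns the cached String.
    @Override
    public String toString() {
        if (decoded == null) {
            decoded = new String(bytes, StandardCharsets.UTF_8);
            decodeCount++;
        }
        return decoded;
    }
}
```

Code that only compares, hashes, or forwards the value never pays for decoding; code that
needs a String pays once.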

As for escape analysis eliding object allocations, this won't affect most use cases here.
It would if you create a new object, call a method on it, and then throw it away within
the scope of a single method or loop, and in a few slightly larger scopes.
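
A small sketch of the two situations, with hypothetical Point and EscapeSketch classes.
In localOnly the object never leaves the loop body, which is the pattern described above;
in escaping it is stored in an array returned to the caller, so it has to exist on the
heap. Whether a given JVM actually elides the first allocation is up to its JIT compiler
and is not guaranteed.

```java
// Hypothetical value class for illustration.
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int manhattan() {
        return Math.abs(x) + Math.abs(y);
    }
}

final class EscapeSketch {
    // The Point is created, used, and dropped inside the loop body:
    // a candidate for allocation elision.
    static long localOnly(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, -i);
            total += p.manhattan();
        }
        return total;
    }

    // Each Point is stored in the returned array, so it escapes the method.
    static Point[] escaping(int n) {
        Point[] points = new Point[n];
        for (int i = 0; i < n; i++) {
            points[i] = new Point(i, -i);
        }
        return points;
    }

    public static void main(String[] args) {
        System.out.println(localOnly(4));
        System.out.println(escaping(3).length);
    }
}
```

Objects handed back from a reader to its caller resemble the second case, which fits the
expectation that most use cases here would not benefit.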

> remove object reuse from Java APIs
> ----------------------------------
>                 Key: AVRO-911
>                 URL: https://issues.apache.org/jira/browse/AVRO-911
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>            Reporter: Doug Cutting
>            Assignee: Doug Cutting
>             Fix For: 1.6.0
>         Attachments: perf-reuse.patch
>
> Avro's Java APIs were designed to permit object reuse when reading with the assumption
> that would provide performance advantages.  In particular, the old parameter in
> DatumReader<T>.read(T old, Decoder), the Utf8 class, and the GenericArray.peek() method
> were all designed for this purpose.  But I am unable to see significant performance
> improvements when objects are reused.  I tried modifying Perf.java's GenericTest to reuse
> records, and its StringTest to not reuse Utf8 instances and, in both cases, performance is
> not substantially altered.
>
> If we were to remove these then issues such as AVRO-803 would disappear.  Always using
> java.lang.String instead of Utf8 would remove a lot of user confusion.


