avro-user mailing list archives

From ey-chih chow <eyc...@hotmail.com>
Subject RE: avro object reuse
Date Thu, 02 Jun 2011 17:23:18 GMT

We create GenericData.Record objects a lot in our code via new GenericData.Record(schema).
Will this generate Jackson calls?  Thanks.
Ey-Chih Chow

> From: scott@richrelevance.com
> To: user@avro.apache.org
> Date: Wed, 1 Jun 2011 18:48:15 -0700
> Subject: Re: avro object reuse
> One thing we do right now that might be related is the following:
> We keep Avro default Schema values as JsonNode objects. While traversing
> the JSON Avro schema representation using ObjectMapper.readTree() we
> remember JsonNodes that are "default" properties on fields and keep them
> on the Schema object.
> If these keep references to the parent (and the whole JSON tree, or worse,
> the ObjectMapper and input stream) it would be poor use of Jackson by us;
> although we'd need a way to keep a detached JsonNode or equivalent.
> However, even if that is the case (which it does not seem to be -- the
> jmap output has no JsonNode instances), it doesn't explain why we would be
> calling ObjectMapper frequently.  We only call
> ObjectMapper.readTree(JsonParser) when creating a Schema from JSON.  We
> call JsonNode methods from extracted fragments for everything else.
> This brings me to the following suspicion based on the data:
> Somewhere, Schema objects are being created frequently via one of the
> Schema.parse() or Protocol.parse() static methods.
> On 6/1/11 5:48 PM, "Tatu Saloranta" <tsaloranta@gmail.com> wrote:
> >On Wed, Jun 1, 2011 at 5:45 PM, Scott Carey <scott@richrelevance.com>
> >wrote:
> >> It would be useful to get a 'jmap -histo:live' report as well, which will
> >> only have items that remain after a full GC.
> >>
> >> However, a high churn of short lived Jackson objects is not expected here
> >> unless the user is reading Json serialized files and not Avro binary.
> >> Avro Data Files only contain binary encoded Avro content.
> >>
> >> It would be surprising to see many Jackson objects here if reading Avro
> >> Data Files, because we expect to use Jackson to parse an Avro schema from
> >> json only once or twice per file.  After the schema is parsed, Jackson
> >> shouldn't be used.  A hundred thousand DeserializationConfig instances
> >> means that isn't the case.
> >
> >Right -- it indicates that something (else) is using Jackson; and
> >there will typically be one instance of DeserializationConfig for each
> >data-binding call (ObjectMapper.readValue()), as a read-only copy is
> >made for operation.
> >... or if something is reading schema that many times, that sounds
> >like a problem in itself.
> >
> >-+ Tatu +-
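
The thread's working suspicion is that Schema objects are being re-created from JSON many times, which would send every call back through Jackson. One way to test or rule that out is to route all schema parsing through a small cache and count how often the parser actually runs. The following is a minimal sketch only: ParsedSchemaCache and parseCalls() are hypothetical names, not part of Avro, and the parser is passed in as a function so the class itself depends only on the JDK.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/**
 * Hypothetical helper (not an Avro class): parses each distinct schema
 * JSON string at most once and hands back the same parsed object afterwards.
 */
final class ParsedSchemaCache<S> {
    // Keyed by the schema's JSON text; the value is the parsed schema.
    private final Map<String, S> cache = new ConcurrentHashMap<>();
    // The expensive parse step, supplied by the caller.
    private final Function<String, S> parser;
    // Counts how many times the parser was really invoked.
    private final AtomicInteger parseCalls = new AtomicInteger();

    ParsedSchemaCache(Function<String, S> parser) {
        this.parser = parser;
    }

    /** Returns the cached schema, parsing only on the first request. */
    S get(String schemaJson) {
        return cache.computeIfAbsent(schemaJson, json -> {
            parseCalls.incrementAndGet();
            return parser.apply(json);
        });
    }

    /** Number of real parse invocations so far. */
    int parseCalls() {
        return parseCalls.get();
    }
}
```

With Avro on the classpath, the parser argument could be something like json -> new Schema.Parser().parse(json), and the Schema returned by get() would then be passed to new GenericData.Record(schema). If parseCalls() stays at one or two while a hundred thousand DeserializationConfig instances still show up in the histogram, the Jackson churn is presumably coming from somewhere other than schema parsing.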