avro-dev mailing list archives

From "Scott Carey (JIRA)" <j...@apache.org>
Subject [jira] Commented: (AVRO-769) Java: Align Decoder/Encoder APIs for consistency and long term stability
Date Tue, 22 Feb 2011 06:11:38 GMT

    [ https://issues.apache.org/jira/browse/AVRO-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12997649#comment-12997649 ]

Scott Carey commented on AVRO-769:

Some extra comments here:

I did not move the configuration items for the Json, Validating, or Resolving decoders/encoders
into the factories.
The intention is that Encoder/Decoder and the two factories will be stable.

Json: requires a Schema, and (currently) can't change its schema, only its input/output.
I left reconfiguring its output out of the factory and put it on the objects. This could
be moved in, and would seem to be very stable.

Validating: requires a schema, and an input/output decoder/encoder.  We could expose the reconfiguration
of the encoder/decoder, but it wasn't in the old API, and I'm not sure how useful that is
versus just making a new one.
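
The trade-off above (reconfigure a validating wrapper versus just build a new one) can be
illustrated with a minimal sketch. Every type below is a hypothetical stand-in written for
this example, not an Avro class: the "schema" is reduced to an ordered list of primitive
kinds, and the byte layout is a simplified fixed-width one rather than Avro's binary format.
The wrapper takes its schema and its underlying encoder at construction and exposes no way
to swap either.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration; none of these are Avro classes.
class ValidatingSketch {

    /** Stand-in for a schema: the ordered primitive types a record must contain. */
    enum Kind { INT, BOOLEAN }

    /** Core write API only; no configuration methods are declared here. */
    static abstract class SketchEncoder {
        abstract void writeInt(int n);
        abstract void writeBoolean(boolean b);
    }

    /** Simplified fixed-width encoding into a byte buffer (not Avro's binary format). */
    static final class BufferEncoder extends SketchEncoder {
        private final ByteArrayOutputStream out;
        BufferEncoder(ByteArrayOutputStream out) { this.out = out; }
        @Override void writeInt(int n) {
            out.write(n >>> 24); out.write(n >>> 16); out.write(n >>> 8); out.write(n);
        }
        @Override void writeBoolean(boolean b) { out.write(b ? 1 : 0); }
    }

    /** Requires both a "schema" and an underlying encoder, fixed at construction. */
    static final class ValidatingEncoder extends SketchEncoder {
        private final List<Kind> expected;
        private final SketchEncoder delegate;
        private int position = 0;

        ValidatingEncoder(List<Kind> expected, SketchEncoder delegate) {
            this.expected = expected;
            this.delegate = delegate;
        }

        // Rejects a write whose type does not match the next expected type.
        private void check(Kind actual) {
            if (position >= expected.size() || expected.get(position) != actual) {
                throw new IllegalStateException("unexpected " + actual + " at position " + position);
            }
            position++;
        }

        @Override void writeInt(int n) { check(Kind.INT); delegate.writeInt(n); }
        @Override void writeBoolean(boolean b) { check(Kind.BOOLEAN); delegate.writeBoolean(b); }
    }

    public static void main(String[] args) {
        List<Kind> schema = Arrays.asList(Kind.INT, Kind.BOOLEAN);

        ByteArrayOutputStream first = new ByteArrayOutputStream();
        SketchEncoder enc = new ValidatingEncoder(schema, new BufferEncoder(first));
        enc.writeInt(7);
        enc.writeBoolean(true);
        System.out.println("bytes written: " + first.size());

        // Switching output: build a fresh wrapper rather than reconfiguring the old one.
        ByteArrayOutputStream second = new ByteArrayOutputStream();
        SketchEncoder next = new ValidatingEncoder(schema, new BufferEncoder(second));
        try {
            next.writeBoolean(false); // wrong order: the "schema" expects an INT first
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this sketch the wrapper holds only a position counter, so constructing a new one per
output is cheap; whether that also holds for the real validating classes is not something
this example can show.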

Resolving: This is where things get trickier.  Not only does this require two schemas,
but it involves an API that I am not sure will be stable on the resolver side.  Having
it swap out the underlying decoder/encoder is only sometimes used, and that use is questionable
as is.  So I did not want to make that part of this 'stable' API yet.  I did make one constructor
form private, which reduces functionality.
In the long run I want to get rid of this class and separate resolution of schema pairs from
decoders entirely.  The current scheme requires that a user traverse the schema AND use the
parser to get the job done, which is inefficient and error prone.

So the summary is that although we could add more to the factories, I would rather leave them
tied to the basics as much as possible -- abstracting away binary encoder/decoder subtypes,
and providing construction facilities in general.  Configuration is only provided for the
binary use cases.  The binary encoders/decoders have to hide their configuration options better
because of the implementation types.  Only ResolvingDecoder has similar configuration complexity,
and that should be dealt with if and when it changes.

> Java: Align Decoder/Encoder APIs for consistency and long term stability 
> -------------------------------------------------------------------------
>                 Key: AVRO-769
>                 URL: https://issues.apache.org/jira/browse/AVRO-769
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>            Reporter: Scott Carey
>            Assignee: Scott Carey
>            Priority: Blocker
>             Fix For: 1.5.0
>         Attachments: AVRO-769.v1.patch
> As part of AVRO-753, we modified the Encoder API to be more like the current Decoder
API.  This issue tracks related changes to solidify the API of both Encoder and Decoder to
be more stable and consistent.  It is expected that the result will be long-lived and not
require major changes in the future for the following reasons:
> * Instantiation and configuration will be funneled through EncoderFactory and DecoderFactory.
 Individual implementation types and constructors are not exposed.  With this abstraction
we could, for example, put the features of BlockingBinaryEncoder into BufferedBinaryEncoder
and not break any user code.  We already have some of this distinction on the Decoder side,
but not all BinaryDecoders are going through the factory.
> * The core Encoder and Decoder abstract classes will not declare configuration methods
or constructors. This makes them 'pure' low level Avro read/write API constructs.  This separation
of concerns means, for example, that not all encoder implementations need wrap an OutputStream
because of init(OutputStream out).
> * The core Encoder and Decoder API does not know or care about Schemas, resolution, or
any other 'higher order' Avro concept.  This is the pure separation of concern for writing/reading
primitive Avro types to/from somewhere.
> * Implementations have been heavily performance tuned on both sides, so changes to the
API necessary for high performance will not be likely.
> The Factories will adhere to the following general principles:
> * configuration options that do not affect the semantics of a type can be set through
the factory, e.g. buffer sizes.
> * configuration that affects the semantics or changes the output or supported input will
have separate factory methods.  For example, choosing between an implementation that requires
calling flush() and one that does not, requires choosing a different factory method to instantiate.
 This is important because it generally means that client code explicitly requests the behavioral
type, and that helps prevent bugs caused by accidentally configuring a factory to return an
object that is incompatible with the use case.
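
The two factory principles quoted above can be sketched as follows. This is an illustration
under stated assumptions, not the patch's code: `SketchEncoder`, `SketchEncoderFactory`, and
the method names `configureBufferSize`, `bufferedEncoder`, and `directEncoder` are all
hypothetical names chosen for this example. Buffer size does not change what is written, so
it is a setting on the factory; the flush-required versus write-through distinction changes
behavior, so each gets its own factory method. The concrete classes are private, so callers
only ever see the abstract type.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical stand-ins for illustration; none of these are Avro classes.
class FactorySketch {

    /** Core write API; construction and configuration live in the factory. */
    static abstract class SketchEncoder {
        abstract void writeByte(int b);
        abstract void flush();
    }

    static final class SketchEncoderFactory {
        private int bufferSize = 16;

        // Non-semantic option: changes performance characteristics only.
        SketchEncoderFactory configureBufferSize(int size) {
            if (size < 1) throw new IllegalArgumentException("size must be positive");
            this.bufferSize = size;
            return this;
        }

        // Semantic choice 1: data may sit in the buffer until flush() is called.
        SketchEncoder bufferedEncoder(OutputStream out) { return new Buffered(out, bufferSize); }

        // Semantic choice 2: every write goes straight to the stream.
        SketchEncoder directEncoder(OutputStream out) { return new Direct(out); }

        // Implementation types are private, so they can change without breaking callers.
        private static final class Buffered extends SketchEncoder {
            private final OutputStream out;
            private final byte[] buf;
            private int count = 0;
            Buffered(OutputStream out, int size) { this.out = out; this.buf = new byte[size]; }
            @Override void writeByte(int b) {
                if (count == buf.length) drain();
                buf[count++] = (byte) b;
            }
            @Override void flush() {
                drain();
                try { out.flush(); } catch (IOException e) { throw new UncheckedIOException(e); }
            }
            private void drain() {
                try { out.write(buf, 0, count); } catch (IOException e) { throw new UncheckedIOException(e); }
                count = 0;
            }
        }

        private static final class Direct extends SketchEncoder {
            private final OutputStream out;
            Direct(OutputStream out) { this.out = out; }
            @Override void writeByte(int b) {
                try { out.write(b); } catch (IOException e) { throw new UncheckedIOException(e); }
            }
            @Override void flush() {
                try { out.flush(); } catch (IOException e) { throw new UncheckedIOException(e); }
            }
        }
    }

    public static void main(String[] args) {
        SketchEncoderFactory factory = new SketchEncoderFactory().configureBufferSize(8);

        ByteArrayOutputStream a = new ByteArrayOutputStream();
        SketchEncoder buffered = factory.bufferedEncoder(a);
        buffered.writeByte(1);
        System.out.println("buffered, before flush: " + a.size());
        buffered.flush();
        System.out.println("buffered, after flush: " + a.size());

        ByteArrayOutputStream b = new ByteArrayOutputStream();
        factory.directEncoder(b).writeByte(1);
        System.out.println("direct, no flush: " + b.size());
    }
}
```

Because the caller names `bufferedEncoder` or `directEncoder` explicitly, a shared factory
instance cannot be reconfigured into handing out an encoder with different flush behavior
than the call site expects.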


