avro-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (AVRO-251) add schema for schemas
Date Mon, 14 Dec 2009 20:27:39 GMT

    [ https://issues.apache.org/jira/browse/AVRO-251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790333#action_12790333 ]

Doug Cutting commented on AVRO-251:

> I think the usecase for it (I know you have one in mind, and we're hinting at it in this

I think you're referring to AVRO-160.  That's indeed what instigated this last week, but over
the weekend I had second thoughts about actually using it there.  I still believe
this is useful for some applications, but, if folks agree with me about AVRO-160, then committing
this is not urgent.

> one alternative is to ditch the whole binary representation and store the original schema
> in Avro-encoded binary JSON.

I like that idea.  It'd be bigger than with the schema here, since all of the Avro keywords
would be included, but it would still be considerably smaller and faster than textual JSON.
Plus, the specification for JSON is much less likely to change, so its schema is likely to
be much more stable, and hence it's less risky to assume that schema as a system constant.
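
To make the size argument concrete, here is a minimal sketch of writing a JSON value in an Avro-style binary form. It is an illustration under stated assumptions, not the encoding proposed in this issue: the class BinaryJsonSketch, its methods, and the union branch order (string, array, map) are hypothetical, and only those three JSON kinds are handled. The wire rules it uses are the ones from the Avro specification: longs are zig-zag varints, strings are a length followed by UTF-8 bytes, a union value is its branch index followed by the value, and arrays and maps are written as counted blocks ending in a zero count.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BinaryJsonSketch {
    // Assumed branch order of a hypothetical JSON union (not taken from the patch).
    static final int STRING = 0, ARRAY = 1, MAP = 2;

    // Avro long: zig-zag encode, then emit 7 bits per byte, low-order group first.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    // Avro string: byte length as a long, then the UTF-8 bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // Union value: branch index first, then the value in that branch's encoding.
    static void writeJson(ByteArrayOutputStream out, Object v) {
        if (v instanceof String) {
            writeLong(out, STRING);
            writeString(out, (String) v);
        } else if (v instanceof List) {
            List<?> items = (List<?>) v;
            writeLong(out, ARRAY);
            if (!items.isEmpty()) {
                writeLong(out, items.size());      // one block holding every item
                for (Object item : items) writeJson(out, item);
            }
            writeLong(out, 0);                     // zero count ends the array
        } else if (v instanceof Map) {
            Map<?, ?> entries = (Map<?, ?>) v;
            writeLong(out, MAP);
            if (!entries.isEmpty()) {
                writeLong(out, entries.size());    // one block holding every entry
                for (Map.Entry<?, ?> e : entries.entrySet()) {
                    writeString(out, (String) e.getKey());
                    writeJson(out, e.getValue());
                }
            }
            writeLong(out, 0);                     // zero count ends the map
        } else {
            throw new IllegalArgumentException("kind not handled in this sketch: " + v);
        }
    }

    static byte[] encode(Object v) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeJson(out, v);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // The schema {"type":"array","items":"string"} as a parsed JSON object.
        Map<String, Object> schema = new LinkedHashMap<>();
        schema.put("type", "array");
        schema.put("items", "string");
        String text = "{\"type\":\"array\",\"items\":\"string\"}";
        System.out.println("text bytes:   " + text.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("binary bytes: " + encode(schema).length);
    }
}
```

Note that the keywords "type" and "items" still appear as strings in the binary form, which is why this comes out bigger than a dedicated schema for schemas would; the savings come from dropping quotes, colons, commas and braces in favor of short length and branch prefixes.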

> I think you would agree that using either the specific (my preference) API or the generic
> API would be clearer from a code perspective.

Perhaps a bit, but not much.  It adds an intermediate representation, which has some cognitive
overhead that this code avoids.  This code instead requires some understanding of Avro's
encoder/decoder API.  I don't think using either API would reduce the code size by more than
perhaps 10%, and I don't think it would be much more robust.  Efficiently mapping the union
branch classes to Schema subclasses would require something like a Map<Class,Schema.Type>.
This table could be built by processing the schema, rather than, as this patch does, by assuming
that the Schema.Type enum is in sync with the union.  But we could change this patch to build
that mapping from the schema too if we are particularly concerned about that.
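
As a rough illustration of building that table from the schema rather than relying on enum order, here is a minimal sketch. Everything in it is hypothetical: the Type enum stands in for a few Schema.Type constants, the *Decl classes stand in for classes that would be generated from the union's branches, and the naming convention used to pair them (RecordDecl maps to RECORD) is an assumption made only for this example. It is not code from the patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class BranchTable {
    // Hypothetical stand-in for a few Schema.Type constants.
    enum Type { RECORD, ENUM, ARRAY }

    // Hypothetical stand-ins for classes generated from the union's branches.
    static class RecordDecl {}
    static class EnumDecl {}
    static class ArrayDecl {}

    // Build the class-to-type table by walking the union's branches in schema
    // order. The result does not depend on Type.ordinal() matching that order.
    static Map<Class<?>, Type> build(List<Class<?>> unionBranches) {
        Map<Class<?>, Type> table = new HashMap<>();
        for (Class<?> branch : unionBranches) {
            String name = branch.getSimpleName();   // e.g. "RecordDecl"
            // Assumed convention: strip the "Decl" suffix to get the type name.
            String stem = name.substring(0, name.length() - "Decl".length());
            table.put(branch, Type.valueOf(stem.toUpperCase(Locale.ROOT)));
        }
        return table;
    }

    public static void main(String[] args) {
        // Union order deliberately differs from the enum's declaration order.
        List<Class<?>> branches = new ArrayList<>();
        branches.add(ArrayDecl.class);
        branches.add(RecordDecl.class);
        branches.add(EnumDecl.class);

        Map<Class<?>, Type> table = build(branches);
        System.out.println(table.get(ArrayDecl.class));
        System.out.println(table.get(RecordDecl.class));
    }
}
```

The trade-off is the one described above: the lookup stays correct if the union and the enum drift apart, at the cost of an extra table and whatever convention is used to pair branches with types.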

I actually generated the specific code first, and considered writing it that way, but it felt
like more work to me.

> add schema for schemas
> ----------------------
>                 Key: AVRO-251
>                 URL: https://issues.apache.org/jira/browse/AVRO-251
>             Project: Avro
>          Issue Type: New Feature
>          Components: java
>            Reporter: Doug Cutting
>            Assignee: Doug Cutting
>             Fix For: 1.3.0
>         Attachments: AVRO-251.patch, AVRO-251.patch
> A schema for schemas would permit schemas to be written in binary.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
