avro-dev mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (AVRO-258) Higher-level language for authoring schemata
Date Fri, 18 Dec 2009 20:23:18 GMT

    [ https://issues.apache.org/jira/browse/AVRO-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792626#action_12792626 ]

Todd Lipcon commented on AVRO-258:

bq. First, I don't think we want to make such a tool a part of the spec

Fair enough - I'm ambivalent there.

bq. Perl or Python might thus be preferable to Java. 

I looked at some Python-based parsers, but the issue is that many of them rely on libraries
rather than code generation. Many of those libraries are GPL- or LGPL-licensed, and also aren't
available on CentOS/RHEL 5, which means that in a lot of ways Python is _less_ deployable than
Java. Pyparsing, which I like a lot and have used before, has a friendly license but still
has the library requirement, and would still have to be bundled with the script. I recently
worked on some Python software that bundles a lot of library dependencies, and it's a huge, huge,
_huge_ pain. :)

I actually almost did this in C/C++ with straight lex/yacc, but went towards Java since it
was easier for a quick first pass. Moving to C in the long run would be fine by me for the
reasons you outlined.

bq. Another approach, rather than trying to make the syntax more Java-like, implementing a
full parser, is to just remove the most annoying things from JSON... more complex JSON transformations
... etc

So, maybe I'm misunderstanding you, but it seems like you're proposing either (a) writing
a custom JSON parser that has some extensions to make the syntax more palatable, or (b) writing
a text-based preprocessor that outputs JSON which is then fed into the parser. Solution (a)
seems to me like it has all the same difficulties as writing our own language, but with a
less familiar syntax. Solution (b) seems hackish, and has the downside that it inherits the
syntactic strangeness of using JSON while not getting the benefits of using a standard language
(editor support, preexisting familiarity, etc).
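
To make option (b) concrete, here is a minimal sketch of a text-based preprocessor, assuming the "annoying things" to remove are {{//}} comments and trailing commas. The function name {{relaxed_to_json}} and the {{User}} record are hypothetical and are not taken from any patch on this issue.

```python
import json
import re


def relaxed_to_json(text: str) -> str:
    """Hypothetical preprocessor: turn 'relaxed' JSON into strict JSON."""
    # Strip // line comments. This is naive: it would also mangle a "//"
    # that appears inside a string value, such as a URL.
    no_comments = re.sub(r"//[^\n]*", "", text)
    # Drop a trailing comma that directly precedes a closing } or ].
    return re.sub(r",\s*([}\]])", r"\1", no_comments)


RELAXED = """
{
  "type": "record",
  "name": "User",   // record name
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "email", "type": "string"},  // trailing comma follows
  ],
}
"""

# The cleaned text is handed to a standard JSON parser.
schema = json.loads(relaxed_to_json(RELAXED))
print(schema["name"], [f["name"] for f in schema["fields"]])
```

Doing this robustly means tracking string literals, at which point the preprocessor is effectively a parser of its own.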

bq. Beyond that, it starts to become lisp-versus-algol, unresolvable and a tremendous time

I'm not convinced that implementing our own language is really that tough. In about 3 hours
of work I got the above stuff done, and I'd never used JavaCC before. As for the religious
lisp-versus-algol question, I think it's already been resolved in the sense that most existing
protocol/data description languages are more algol-like than JSON-like (e.g. XDR, CORBA IDL,
protobufs, Apache Thrift, Apache Etch). The counterexamples are things like WSDL, which no
one seems to really like.

To reiterate, I'm definitely _not_ suggesting that JSON be supplanted as the definitive schema
definition language for Avro. It's great in that there are existing parsers in most languages
and it is readily machine-readable.
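
As a small illustration of that machine-readability, the sketch below reads a JSON protocol definition using only Python's standard {{json}} module. The {{Mail}} protocol and its single {{send}} message are made-up examples; the top-level keys ({{protocol}}, {{types}}, {{messages}}, {{request}}, {{response}}) are assumed to follow the protocol layout in the Avro spec.

```python
import json

# Illustrative protocol definition; the names are invented for this example.
PROTOCOL_JSON = """
{
  "protocol": "Mail",
  "namespace": "org.example",
  "types": [
    {"type": "record", "name": "Message",
     "fields": [{"name": "to", "type": "string"},
                {"name": "body", "type": "string"}]}
  ],
  "messages": {
    "send": {
      "request": [{"name": "message", "type": "Message"}],
      "response": "string"
    }
  }
}
"""

protocol = json.loads(PROTOCOL_JSON)

# Build one readable signature string per message in the protocol.
signatures = []
for name, msg in protocol["messages"].items():
    params = ", ".join(f'{p["type"]} {p["name"]}' for p in msg["request"])
    signatures.append(f'{msg["response"]} {name}({params})')

print(protocol["protocol"], signatures)
```

No Avro-specific library is needed to walk the definition, which is the property worth keeping even if a friendlier authoring syntax sits on top.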

> Higher-level language for authoring schemata
> --------------------------------------------
>                 Key: AVRO-258
>                 URL: https://issues.apache.org/jira/browse/AVRO-258
>             Project: Avro
>          Issue Type: New Feature
>          Components: spec
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
> Early users of Avro have noted that authoring schemas and especially protocols in JSON
> feels unnatural. This JIRA is to work on a higher-level language that feels more like defining
> interfaces and classes in Java/C/etc.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
