avro-dev mailing list archives

From "Amichai Rothman (JIRA)" <j...@apache.org>
Subject [jira] Updated: (AVRO-438) spec organization and clarification improvements
Date Wed, 03 Mar 2010 16:26:27 GMT

     [ https://issues.apache.org/jira/browse/AVRO-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Amichai Rothman updated AVRO-438:

    Status: Patch Available  (was: Open)

The patch fixes most of the issues. A few more thoughts:

- The block mechanism described for arrays and maps is basically a copy&paste of a few paragraphs.
Perhaps the map serialization can simply be described as an array where each item is a key
immediately followed by its respective value?

- I added the binary encoding to the file format section, but not to the RPC section, since I got
more confused there. The discussion of the HTTP transport says to use the "avro/binary" content
type, which suggests there might also be an "avro/json" version later on or something like
that. So maybe the serialization format is actually transport-dependent and not part of the
spec? Maybe there should be another section for the binary socket transport implementation?

- Further, AIUI "avro/binary" is not a legal HTTP content type. It should be something more
like "application/x-avro-binary"  (or registered with IANA). But this digresses into changes
in the spec itself, not just its wording. Should I open this as a separate bug, if it is one?

- As for the example, yes, it would be mostly binary, but it can be annotated to explain what
each group of bytes means.
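
To illustrate the first point above, here is a minimal sketch (not taken from the patch or from
any Avro implementation) of describing a map as an array whose items are a key immediately
followed by its value. The function names are hypothetical, and for brevity the sketch always
writes a single block followed by the zero-count terminator.

```python
def encode_long(n: int) -> bytes:
    """Zigzag-map a signed 64-bit value, then emit 7-bit groups, lowest group first."""
    z = (n << 1) ^ (n >> 63)          # non-negative for any 64-bit signed n
    out = bytearray()
    while z & ~0x7F:                  # more than 7 bits left: set the continuation bit
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """A string is a long byte count followed by that many UTF-8 bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


def encode_array(items, encode_item) -> bytes:
    """One block (count, then items) if non-empty, then a zero count to end the array."""
    out = bytearray()
    if items:
        out += encode_long(len(items))
        for item in items:
            out += encode_item(item)
    out += encode_long(0)
    return bytes(out)


def encode_map(mapping, encode_value) -> bytes:
    """A map written as an array whose items are 'key string, then value'."""
    pairs = list(mapping.items())
    return encode_array(pairs, lambda kv: encode_string(kv[0]) + encode_value(kv[1]))
```

With this framing the map description needs only one extra sentence beyond the array
description, e.g. encode_map({"a": 1}, encode_long) reuses the array blocks unchanged.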

> spec organization and clarification improvements
> ------------------------------------------------
>                 Key: AVRO-438
>                 URL: https://issues.apache.org/jira/browse/AVRO-438
>             Project: Avro
>          Issue Type: Improvement
>          Components: spec
>    Affects Versions: 1.3.0
>            Reporter: Amichai Rothman
>            Priority: Trivial
>         Attachments: fix_spec_loose_ends.patch
> There are a few improvements that can be made to make the spec better organized and clarify
> ambiguous meanings:
> 1. The binary encoding specifies string, then bytes, then longs. However, the first two
> are dependent on the latter, so in essence long encoding is being used before it was defined.
> In addition, string comes before bytes even though it is logically a special case of bytes.
> It would be clearer if these were ordered long, bytes, string so that each definition builds
> on its predecessors and nothing is used before it is defined. Maybe bytes/string should be
> at the end of the other primitives, since they are technically more complex structures. Note
> that it might be a good idea to do this in all places in the spec where primitives are enumerated.
> 2. The sentence about array count and size is a bit confusing. A possible alternative:
> "If a block's count is negative, its absolute value is used, and it is followed immediately
> by a long block size indicating the number of bytes in the block."
> and maybe this should be immediately followed by the sentence explaining why this is
> useful which is currently a few lines below.
> 3. There is a note about blocks being in experimental stage, but it's unclear if this
> is only for map blocks or also for array blocks.
> 4. Object Container Files and Protocol Declarations are described in the spec using JSON
> objects and their schema is shown, but it doesn't say anywhere how these should be serialized.
> If it's using binary serialization, it should say so explicitly. If it can be either binary
> or JSON, then the file has no self-describing way of differentiating the two - this should
> be addressed somewhere (maybe have a different magic word for binary/JSON content).
> 5. Protocol Definition has a namespace and name (called protocol), but it is not clear
> whether the namespace rules defined in the first section apply here or not. It should be mentioned
> explicitly either way.
> 6. It would be extremely helpful to have a full sample of an RPC call over HTTP, possibly
> using the HelloWorld protocol from the previous example. This would show how the transport,
> framing, handshake, call format and messages all fit together. Examples in RFCs often help
> clarify any misunderstandings that might arise from the body of the specs, which makes for
> a better spec - and this would be great here too.
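
Items 1 and 2 of the quoted description can be sketched from the reader's side. The following
is an illustrative decoder, not code from the patch or from any Avro implementation; all
function names are hypothetical. It defines long first, then bytes on top of long, then string
on top of bytes, and shows why a negative block count followed by a byte size is useful: the
reader can skip the block without decoding its items.

```python
import io


def read_long(buf: io.BytesIO) -> int:
    """Read 7-bit groups (lowest first) until the continuation bit is clear, then un-zigzag."""
    shift = 0
    acc = 0
    while True:
        chunk = buf.read(1)
        if not chunk:
            raise EOFError("truncated long")
        b = chunk[0]
        acc |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (acc >> 1) ^ -(acc & 1)


def read_bytes(buf: io.BytesIO) -> bytes:
    """bytes = a long length, then that many bytes (builds on read_long)."""
    return buf.read(read_long(buf))


def read_string(buf: io.BytesIO) -> str:
    """string = bytes interpreted as UTF-8 (builds on read_bytes)."""
    return read_bytes(buf).decode("utf-8")


def count_items(buf: io.BytesIO, read_item) -> int:
    """Count the items of an array, skipping any block that carries its byte size."""
    total = 0
    while True:
        count = read_long(buf)
        if count == 0:                     # a zero count ends the array
            return total
        if count < 0:
            count = -count                 # the absolute value is the item count
            size = read_long(buf)          # byte size of the block's items
            buf.seek(size, io.SEEK_CUR)    # skip the block without decoding it
        else:
            for _ in range(count):
                read_item(buf)             # no size given: each item must be decoded
        total += count
```

For example, the bytes 03 04 06 36 00 describe one block with count -2 and size 2, so
count_items can step over the two item bytes (06 36) without knowing the item schema.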
