avro-dev mailing list archives

From "Matt Massie (JIRA)" <j...@apache.org>
Subject [jira] Commented: (AVRO-341) specify avro transport in spec
Date Tue, 09 Feb 2010 00:11:28 GMT

    [ https://issues.apache.org/jira/browse/AVRO-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831204#action_12831204 ]

Matt Massie commented on AVRO-341:

I support the idea of having an Avro RPC specification that is written as much as possible
(completely?) in Avro schema.  This isn't just good design, it also prevents duplicating work.
I agree with Phil that we *don't* want...

bq. Instead of saying "and then there shall be a long, encoded like so, and then it shall
by follows by that many bytes",...

There are many good examples of RPC/serialization programs that describe RPC using IDLs. 
For example, the Internet Communications Engine (http://www.zeroc.com/doc/Ice-3.3.1/manual/Protocol.39.3.html)
describes its RPC protocol using Slice (Ice's IDL).  SunRPC uses XDR to completely describe
RPC (http://www.faqs.org/rfcs/rfc1050.html).  There's even an RFC protocol script that pulls
all the XDR definitions from an RFC and writes them into a single protocol (.x) file to be
run using {{rpcgen}}.  1970s tech FTW!

Here is a straw man to make it a little clearer what I'm proposing here.

{"type": "record",
 "name": "rpc_message",
 "fields": [
    {"name": "xid", "type": "long"},
    {"name": "auth", "type": "bytes"},
    {"name": "body", "type": [

       {"type": "record",
        "name": "rpc_call_message",
        "fields": [
          {"name": "rpcvers", "type": "long"},
          {"name": "service", "type": "long"},
          {"name": "version", "type": "long"},
          {"name": "method", "type": "long"}]},

       {"type": "record",
        "name": "rpc_response_success",
        "fields": [
          {"name": "results", "type": "bytes"}]},

       {"type": "record",
        "name": "rpc_version_mismatch",
        "fields": [
          {"name": "low_version", "type": "long"},
          {"name": "high_version", "type": "long"}]},

       {"type": "record",
        "name": "rpc_service_unavailable",
        "fields": [
          {"name": "reason", "type": "bytes"}]},

       {"type": "record",
        "name": "rpc_call_version_mismatch",
        "fields": [
          {"name": "low_version", "type": "long"},
          {"name": "high_version", "type": "long"}]},

       {"type": "record",
        "name": "rpc_auth_error",
        "fields": [
          {"name": "reason", "type": "bytes"}]} ]} ]}

This example is really just RFC 1050 wrapped up in Avro schema.  This schema isn't complete
but it's *explicit*.  For example, it says that an {{rpc_response_success}} message is nothing
but a bunch of bytes.  That's okay.  We can drill into the details of those opaque bytes in
a separate response schema definition.  This layering will give us flexibility in the future
and make it easier to break RPC into components.  For example, in this case, we could easily
create a base RPC proxy for clients that passes the response bytes "up" to a higher level
response processor.  The proxy only needs to know the base RPC schema and nothing more. 
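
To make the layering concrete, here is a minimal Python sketch of such a base proxy.  It is only an illustration under assumptions: it supposes the straw-man schema is read as a record whose {{body}} is a union, that {{rpc_response_success}} sits at position 1 in that union, and that messages use standard Avro binary encoding.  The names {{read_long}}, {{read_bytes}}, {{proxy_dispatch}} and {{SUCCESS_BRANCH}} are hypothetical and not part of any Avro API.

```python
import io

# Assumption: position of rpc_response_success in the "body" union
# of the straw-man schema (rpc_call_message would be position 0).
SUCCESS_BRANCH = 1


def read_long(buf):
    """Decode one Avro binary long: base-128 varint, then zigzag."""
    shift = 0
    z = 0
    while True:
        chunk = buf.read(1)
        if not chunk:
            raise EOFError("truncated long")
        byte = chunk[0]
        z |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1)


def read_bytes(buf):
    """Decode Avro binary bytes: a long length followed by that many bytes."""
    n = read_long(buf)
    data = buf.read(n)
    if len(data) != n:
        raise EOFError("truncated bytes")
    return data


def proxy_dispatch(raw, handler):
    """Parse only the base envelope and pass the opaque results bytes 'up'."""
    buf = io.BytesIO(raw)
    xid = read_long(buf)
    read_bytes(buf)  # auth is skipped in this sketch
    branch = read_long(buf)
    if branch != SUCCESS_BRANCH:
        raise ValueError("unhandled body branch %d" % branch)
    results = read_bytes(buf)
    # The proxy never looks inside `results`; a higher-level
    # response processor (the handler) owns that schema.
    return handler(xid, results)
```

For example, {{proxy_dispatch(raw, lambda xid, payload: (xid, payload))}} hands the undecoded payload straight to the caller, which is the point of keeping the base schema separate from the response schema.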

This base is also very light.  You could {{CALL}} a remote method using as little as 6 bytes
sent over whatever transport you like e.g. UDP, TCP, SSL, TCP-over-DNS. Transports only deal
in *bytes* and could not care less about messages (although we may need to define record marking
like section 10 of RFC 1050).  
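
As a rough size check, the following Python sketch encodes a minimal {{CALL}} by hand.  It assumes the straw-man schema is read as a record whose {{body}} is a union with {{rpc_call_message}} at position 0, and it assumes standard Avro binary encoding (zigzag varint longs, length-prefixed bytes, a long branch index before a union value).  The helper names are hypothetical.  Under those assumptions the six fields cost one byte each when the values are small and {{auth}} is empty, and the union branch index adds one more byte.

```python
def encode_long(n):
    """Encode one Avro binary long: zigzag, then base-128 varint."""
    z = (n << 1) ^ (n >> 63)  # zigzag mapping for 64-bit signed values
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_bytes(data):
    """Encode Avro binary bytes: a long length followed by the data."""
    return encode_long(len(data)) + data


# Assumption: rpc_call_message is the first branch of the "body" union.
CALL_BRANCH = 0


def encode_call(xid, auth, rpcvers, service, version, method):
    """Serialize an rpc_message carrying an rpc_call_message body."""
    return b"".join([
        encode_long(xid),
        encode_bytes(auth),
        encode_long(CALL_BRANCH),
        encode_long(rpcvers),
        encode_long(service),
        encode_long(version),
        encode_long(method),
    ])


message = encode_call(xid=1, auth=b"", rpcvers=1, service=1, version=1, method=1)
print(len(message), message.hex())
```

The transport underneath sees only this byte string; nothing in it depends on whether it travels over UDP or TCP.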

Btw, RPC is at the *session* level with the *presentation* to the *application* being Avro
serialization.  I agree that using the term Avro *transport* is confusing and makes me cry
elephant tears.

|| OSI Layer || Example ||
| Application | Avro Proxy Object |
| Presentation | Avro Binary Serialization |
| Session | Avro RPC state machine |
| Transport | TCP, UDP, etc |
| Network | IP |
| Data-link/Physical | Ethernet |

bq. DISCOVER: Asks the server for information about itself. 

In terms of vocabulary, I feel that *discovery* is more about finding all *machines* running
Avro services (like Bonjour or Zeroconf).  The term *introspection* seems more appropriate.

Aside from introspection, we also need a simple Avro "ping" service.  Using the base schema
above, we could have a convention that says

* All service numbers less than zero are reserved for Avro use (for discovery/introspection,
ping, etc)
* All service numbers greater than zero are for user-defined services.
* Service number zero is the ping service.
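
The convention above could be expressed as a small lookup.  This is only a sketch of the proposed numbering; the function name and the returned labels are hypothetical.

```python
def classify_service(service):
    """Map a service number to its role under the proposed convention."""
    if service < 0:
        return "reserved"  # Avro-internal: discovery/introspection, etc.
    if service == 0:
        return "ping"
    return "user"  # user-defined services


for number in (-1, 0, 42):
    print(number, classify_service(number))
```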

> specify avro transport in spec
> ------------------------------
>                 Key: AVRO-341
>                 URL: https://issues.apache.org/jira/browse/AVRO-341
>             Project: Avro
>          Issue Type: Improvement
>          Components: spec
>            Reporter: Doug Cutting
> We should develop a high-performance, secure, transport for Avro.  This should be named
with avro: uris.

