Date: Thu, 8 Apr 2010 09:54:30 -0600
Subject: Re: Thoughts on an RPC protocol
From: Bruce Mitchener
To: avro-dev@hadoop.apache.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=0016e65ae682bc7d420483bbb15b

--0016e65ae682bc7d420483bbb15b
Content-Type: text/plain; charset=ISO-8859-1

While I recommend actually reading RFC 3080 (it is an easy read), this
summary may help...

Framing: Length-prefixed data, nothing unusual.

Encoding: Messages are effectively this:

    enum message_type {
        message,  // a request
        reply,    // when there's only a single reply
        answer,   // when there are multiple replies, send multiple
                  // answers and then a null
        null,     // terminates a chain of answers
        error,    // oops, there was an error
    }

    struct message {
        enum message_type message_type;
        int channel;
        int message_id;
        bool more;                   // true if more data for this message
                                     // is still coming (for streaming)
        int sequence_number;         // see RFC 3080
        optional int answer_number;  // used for answers
        bytes payload;               // the actual RPC command, still
                                     // serialized here
    }

When a connection is opened, there's initially one channel, channel 0.
That channel is used for commands controlling the connection state, like
opening and closing channels. We should also perform Avro RPC handshakes
over channel 0.

Channels allow for concurrency. You can send requests/messages down
multiple channels and process them independently. Messages on a single
channel need to be processed in order, though. This allows for both
guaranteed order of execution (within a single channel) and greater
concurrency (multiple channels).

Streaming happens in two ways.
The first way is to set the more flag on a message. This means that the
data has been broken up over multiple messages and you need to receive
the whole thing before processing it.

The second is to have multiple answers (followed by a null frame) to a
single request message. This allows you to process the reply data in a
streaming fashion. The only thing that this doesn't allow is processing
the request data in a streaming fashion as it is being sent, but you
could look at doing that by sending multiple request messages instead.

Security and privacy can be handled by SASL.

The RFC defines a number of ways in which you can detect buggy
implementations of the protocol or invalid data being sent (framing /
encoding violations).

This should be pretty straightforward to implement, and since I need
such a thing in the immediate future, I've already begun an
implementation in C.

 - Bruce

On Wed, Apr 7, 2010 at 4:13 PM, Bruce Mitchener wrote:

> I'm assuming that the goals of an optimized transport for Avro RPC are
> something like the following:
>
> * Framing should be efficient, easy to implement.
> * Streaming of large values, both as part of a request and as a response
> is very important.
> * Being able to have multiple concurrent requests in flight, while also
> being able to have ordering guarantees where desired is necessary.
> * It should be easy to implement this in Java, C, Python, Ruby, etc.
> * Security is or will be important. This security can include
> authorization as well as privacy concerns.
>
> I'd like to see something based largely upon RFC 3080, with some
> simplifications and extensions:
>
> http://www.faqs.org/rfcs/rfc3080.html
>
> What does this get us?
>
> * This system has mechanisms in place for streaming both a single large
> message and breaking a single reply up into multiple answers, allowing for
> pretty flexible streaming. (You can even mix these by having an answer that
> gets chunked itself.)
> * Concurrency is achieved by having multiple channels. Each channel
> executes messages in order, so you have a good mechanism for sending
> multiple things at once as well as maintaining ordering guarantees as
> necessary.
> * Reporting errors is very clear as it is a separate response type.
> * It has already been specified pretty clearly and we'd just be evolving
> that to something that more closely matches our needs.
> * It specifies sufficient data that you could implement this over
> transports other than TCP, such as UDP.
>
> Changes, rough list:
>
> * Use Avro-encoding for most things, so the encoding of a message would
> become an Avro struct.
> * Lose profiles in the sense that they're used in that specification since
> we're just exchanging Avro RPCs.
> * Do length prefixing rather than in the header, so that it is very
> amenable to binary I/O at high volumes.
> * No XML stuff, just existing things like the Avro handshake, wrapped up
> in messages.
> * For now, don't worry about things like flow control as expressed in RFC
> 3081, mapping of 3080 to TCP.
> * Think about adding something for true one-way messages, but an empty
> reply frame is probably sufficient, since that still allows reporting errors
> if needed (or desired).
> * May well need some extensions for a more flexible security model.
> * Use Avro RPC stuff to encode the channel management commands on channel
> 0 rather than XML.
>
> RFC 3117 (http://www.faqs.org/rfcs/rfc3117.html) goes into some of the
> philosophy and thinking behind the design of RFC 3080. Both are short and
> easy reading.
>
> - Bruce
>

--0016e65ae682bc7d420483bbb15b--
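
As a rough illustration of the length-prefixed framing and the message
layout summarized at the top of this message, here is a minimal Python
sketch. It is not the C implementation mentioned above and it is not taken
from RFC 3080. The 4-byte length prefix, the fixed-width big-endian header,
and the names Message, encode_frame and decode_frame are all assumptions
made for the example; the proposal itself would encode the message as an
Avro struct instead.

```python
import struct
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional, Tuple


class MessageType(IntEnum):
    MESSAGE = 0  # a request
    REPLY = 1    # the single reply to a request
    ANSWER = 2   # one of several replies to a request
    NULL = 3     # terminates a chain of answers
    ERROR = 4    # the request failed


@dataclass
class Message:
    message_type: MessageType
    channel: int
    message_id: int
    more: bool                    # True if more chunks of this message follow
    sequence_number: int          # passed through untouched; see RFC 3080
    answer_number: Optional[int]  # only meaningful for answers
    payload: bytes                # the serialized RPC data


# Assumed wire layout (hypothetical, not from the proposal): type, channel,
# message_id, more flag, sequence_number, has-answer flag, answer_number.
_HEADER = struct.Struct(">BIIBIBI")
_LENGTH = struct.Struct(">I")  # assumed 4-byte big-endian length prefix


def encode_frame(msg: Message) -> bytes:
    """Serialize one message and prefix it with its length."""
    has_answer = msg.answer_number is not None
    body = _HEADER.pack(
        int(msg.message_type), msg.channel, msg.message_id, int(msg.more),
        msg.sequence_number, int(has_answer), msg.answer_number or 0,
    ) + msg.payload
    return _LENGTH.pack(len(body)) + body


def decode_frame(buf: bytes) -> Tuple[Optional[Message], bytes]:
    """Return (message, remaining bytes), or (None, buf) if incomplete."""
    if len(buf) < _LENGTH.size:
        return None, buf
    (length,) = _LENGTH.unpack_from(buf)
    end = _LENGTH.size + length
    if len(buf) < end:
        return None, buf
    if length < _HEADER.size:
        raise ValueError("frame too short to hold a message header")
    mtype, channel, message_id, more, seq, has_answer, answer = (
        _HEADER.unpack_from(buf, _LENGTH.size)
    )
    payload = buf[_LENGTH.size + _HEADER.size:end]
    msg = Message(MessageType(mtype), channel, message_id, bool(more), seq,
                  answer if has_answer else None, payload)
    return msg, buf[end:]


if __name__ == "__main__":
    # One request on channel 1, split into two chunks with the more flag.
    first = Message(MessageType.MESSAGE, 1, 42, True, 0, None, b"hello, ")
    last = Message(MessageType.MESSAGE, 1, 42, False, 7, None, b"world")
    stream = encode_frame(first) + encode_frame(last)
    payload = b""
    while stream:
        msg, stream = decode_frame(stream)
        if msg is None:
            break  # incomplete frame: wait for more bytes
        payload += msg.payload
        if not msg.more:
            print(payload)  # the whole request has arrived
```

The demo at the bottom shows the first streaming style: the receiver
buffers chunks until it sees a message whose more flag is clear. The second
style (several answer messages followed by a null) would use the same
framing, with the receiver handling each answer as it arrives.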