avro-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From James Baldassari <jbaldass...@gmail.com>
Subject Re: Avro enhancement: asynchronous RPCs for Java clients
Date Thu, 02 Jun 2011 06:35:39 GMT
Thanks, Scott!  Patch posted to
https://issues.apache.org/jira/browse/AVRO-539, and it's up on ReviewBoard
here: https://reviews.apache.org/r/834/


On Thu, Jun 2, 2011 at 1:50 AM, Scott Carey <scott@richrelevance.com> wrote:

> Sounds great!
> Put a patch into:
> https://issues.apache.org/jira/browse/AVRO-539
> (We may rename that and modify the description to better describe your
> change; but it clearly fulfills the intention of the original JIRA)
> If you'd like you can also put it in ReviewBoard, with a link to that
> inside the ticket.  But until there is a patch uploaded in JIRA granting
> license to Apache it can't be considered for committing.
> We can discuss remaining details in the ticket.
> Thanks!
> On 6/1/11 10:22 PM, "James Baldassari" <jbaldassari@gmail.com> wrote:
> I just finished a second attempt at the asynchronous RPC implementation
> incorporating Philip's feedback and some other ideas that I had.  I think
> it's easiest to explain how it works with an example.  So here's a simple
> IDL and schema:
> IDL:
> protocol Calculator {
>   int add(int arg1, int arg2);
> }
> Schema:
> {"protocol":"Calculator","messages":{
>   "add":{
>     "request":[{"name":"arg1","type":"int"},{"name":"arg2","type":"int"}],
>     "response":"int"}}}
> No changes are required to the IDL or schema to enable async RPCs.  The
> Avro Java compiler will generate two interfaces instead of one.  The first
> interface, Calculator, contains the standard synchronous methods.  The
> second interface, CalculatorClient, extends Calculator and adds asynchronous
> methods for all two-way messages.  The reason why the async methods are
> separated out into a separate interface is that the responder/server side
> doesn't need to know (and shouldn't know) about the client-side async
> methods.  So the Responder/server implements Calculator, and the
> Requestor/client can either use Calculator or CalculatorClient to invoke the
> RPCs.  For reference, here is what the two generated interfaces look like
> (without the PROTOCOL field and package names):
> public interface Calculator {
>   int add(int arg1, int arg2) throws AvroRemoteException;
> }
> public interface CalculatorClient extends Calculator {
>   CallFuture<Integer> addAsync(int arg1, int arg2) throws IOException;
>   CallFuture<Integer> addAsync(int arg1, int arg2, Callback<Integer>
> callback) throws IOException;
> }
> The CalculatorClient interface is the only new component.  It has two
> methods for each message, one that takes a Callback and one that does not.
> Both methods return a CallFuture so that the client has the option of using
> either the Future or the Callback to obtain the result of the RPC.
> Future.get() blocks until the RPC is complete, and either returns the result
> or throws an exception if one occurred during the RPC.  The Callback
> interface has two methods, handleResult(T result) and handleError(Exception
> error).  One or the other is always called depending on whether the RPC was
> successful or an Exception was thrown.
> In addition to the compiler changes, I had to make some changes in the
> avro-ipc project to get the async plumbing to work correctly.  Most of these
> changes are in Requestor and NettyTransceiver.  As part of the changes I had
> to make to Requestor I ended up replacing a couple of large synchronized
> blocks with finer-grained critical sections protected by reentrant locks.  I
> think this change improved performance overall, at least in the case where
> multiple threads are using the same client.  I implemented a rudimentary
> performance test that spins up a bunch of threads, executes the same RPC
> (Simple.hello(String)) repeatedly for a fixed amount of time, and then
> calculates the average number of RPCs completed per second.  With Avro 1.5.1
> I got 7,450 RPCs/sec, and with my modified version of trunk I got 19,050
> RPCs/sec.  That was a very simple test, but if there is a standard benchmark
> that the Avro team uses I'd be happy to rerun my tests using that.
> So that's basically it.  All existing unit tests pass, and I wrote
> additional tests for all the new async functionality.  I've documented all
> public interfaces, and I think the changes are ready to be reviewed if any
> of the committers have time to take a look.  Should I post a patch
> somewhere?  AVRO-539?  ReviewBoard?
> -James
> On Tue, May 31, 2011 at 9:09 PM, James Baldassari <jbaldassari@gmail.com>wrote:
>> Thanks for the helpful feedback!
>> After thinking about this more, I agree that it would be cleaner and
>> simpler to remove the "async" keyword/property from the IDL and schema.
>> Instead I'll just generate the asynchronous companion methods for all
>> two-way messages.
>> Regarding the passing of RPC results and exceptions/errors back to the
>> client asynchronously, I'm not sure what the best approach is.  I had
>> considered the use of both the future pattern and the callback pattern, but
>> I think the callback pattern is more useful in general.  One option I had
>> considered was having the async methods accept a Callback parameter and also
>> return a Future so that the client could choose which pattern to use on a
>> case-by-case basis.  I think I may go with the combined callback/future
>> approach as it provides clients with the most flexibility.
>> For Futures, error handling is straightforward: the get() method either
>> returns the result of the callback or throws the exception if one occurred.
>> The correct solution for the callback approach is less obvious, but I'll
>> tell you the approach that I took, and I'm open to comments.  The Callback
>> interface has two methods, one for handling a successful RPC result, and one
>> for handling exceptions that occur during RPC execution.  For each RPC,
>> either one or the other will be called, but never both.  The methods look
>> like this:
>> void handleCallback(T result);
>> void handleError(Exception error);
>> I'll work on making the changes you suggested, and hopefully I'll have a
>> patch that's in decent shape by the end of the week.
>> -James
>> On Tue, May 31, 2011 at 4:40 PM, Philip Zeyliger <philip@cloudera.com>wrote:
>>> Hi James,
>>> I think this is a fine approach.  You're correct that the place to do it
>>> is in the code generator.  It's not obvious to me that it belongs in the
>>> schema, however, since you might have two different pieces of software that
>>> want to use the same RPCs differently.  Is there any harm in
>>> always generating both types?
>>> As for your API, you'll want to specify very explicitly what happens to
>>> the exceptions that an Avro RPC call may declare.  Future<V> is one
>>> mechanism.  As is your Callback<Foo> mechanism, if there's a way to get
>>> exceptional states.
>>> On Tue, May 31, 2011 at 12:08 AM, James Baldassari <
>>> jbaldassari@gmail.com> wrote:
>>>> Hi,
>>>> I recently started playing with Avro RPCs, and while it's great that
>>>> Avro can use Netty for asynchronous I/O, all client-facing Java RPC
>>>> interfaces are currently synchronous (as far as I can tell).  It would be
>>>> nice to have asynchronous RPCs that use callbacks to take full advantage
>>>> Netty's asynchronous features.  I found at least one other request for this
>>>> feature on the Avro list (http://bit.ly/iCD0ae), and I believe this
>>>> enhancement is already documented as AVRO-539.
>>>> I took it on as a weekend project to add async Java RPCs to Avro, and I
>>>> think I have it all working now including unit tests.  I'd love to
>>>> contribute this patch once I've gotten some feedback, cleaned up the
>>>> documentation, and written a few more tests.  I'll give a quick example
>>>> demonstrating this new functionality.  Consider the following IDL and its
>>>> associated schema which use the asynchronous feature:
>>>> IDL:
>>>> protocol Calculator {
>>>>   int add(int arg1, int arg2) async;
>>>> }
>>>> Schema:
>>>> {"protocol":"Calculator","messages":{
>>>>   "add":{
>>>> "request":[{"name":"arg1","type":"int"},{"name":"arg2","type":"int"}],
>>>>     "response":"int",
>>>>     "async":true}}}
>>>> When the "async" keyword/property is present in a message, the Avro Java
>>>> compiler generates two methods instead of one: the standard synchronous
>>>> method and an asynchronous companion method.  For the example I gave above,
>>>> the following interface would be generated (the static PROTOCOL field and
>>>> package names are omitted for brevity):
>>>> public interface Calculator {
>>>>   int add(int arg1, int arg2) throws AvroRemoteException;
>>>>   void addAsync(int arg1, int arg2, Callback<Integer> callback) throws
>>>> IOException;
>>>> }
>>>> The addAsync method returns immediately without blocking (except one
>>>> special case which I'll describe shortly).  The callback provided as the
>>>> last argument to addAsync will be invoked with the result of the RPC as soon
>>>> as the server returns it.  There are a couple of caveats.  The first RPC,
>>>> whether synchronous or asynchronous, must block until the client-server
>>>> handshake has been completed.  Subsequent async RPCs will never block.
>>>> Also, only the NettyTransceiver currently works with async RPCs; all other
>>>> transceivers throw UnsupportedOperationException.
>>>> So that's the basic overview of my changes.  Please let me know if there
>>>> are any comments or questions.  Is this something people are interested in?
>>>> If so, should I post the patch in AVRO-539 or somewhere else?
>>>> Thanks,
>>>> James

View raw message