qpid-users mailing list archives

From Alan Conway <acon...@redhat.com>
Subject Re: proton Messenger error handling/recovery REQUEST FEEDBACK!
Date Tue, 09 Sep 2014 14:59:33 GMT
On Mon, 2014-09-08 at 19:07 +0100, Fraser Adams wrote:
> Messenger gurus seem to be keeping their heads down a bit.
> 
> Is it *really* just Alan and I who are interested to understand the 
> error handling/reconnection behaviour of Messenger?
> 
> Is anybody using it in "industrial strength" applications or is it just 
> being used in quick and dirty demos? Without error handling and 
> reconnection mechanisms I'm struggling to see how it can be used for the 
> former.
> 
> I can likely hack things and Alan also mentioned that he "cheats", but 
> I'd really like to know from people who really understand messenger how 
> to do it *properly*.
> 

I've been looking at this and error handling in Messenger is not just a
matter of fixing implementation, there are some pretty big API questions
to be answered about when and how you can report errors. It's not
unfixable but I'm starting to think about moving away from Messenger and
towards using the proton Engine API.

The original tradeoff was that engine is more complete and flexible but
harder to use, whereas Messenger is easy but not as complete/flexible.
However if you look at the toolkit & examples at
 https://github.com/grs/examples
it makes engine a lot more appealing. The idea is to provide blocks of
"normal default" behavior in a toolkit to get going quickly (and to keep
you going for many/most uses) but allow those to be modified or replaced
as things get more complex. The nice thing about this is that you know
you can peel back the toolkit if you need to and get full access to the
proton event machine, so anything proton knows you can react to.

If we can make the engine API approachable enough for general messaging
use (while keeping it powerful enough for integration use) then it might
make more sense to focus on doing that than on maintaining two different
APIs for proton.

Cheers,
Alan.

> Frase
> 
> 
> On 05/09/14 14:17, Alan Conway wrote:
> > On Thu, 2014-09-04 at 18:28 +0100, Fraser Adams wrote:
> >> On 03/09/14 23:29, Alan Conway wrote:
> >>> On Wed, 2014-09-03 at 20:05 +0100, Fraser Adams wrote:
> >>>> Hello,
> >>>> I've probably missed something, but I don't know how to reliably detect
> >>>> failures and reconnect.
> >>>>
> >>>> So if I sent to an address with a freshly stood up Messenger instance
> >>>> and the address can't be found things aren't too bad and I wind up with
> >>>> an ECONNREFUSED that I could do something with, however if I've been
> >>>> sending messages to a valid address then I kill off the consumer I see a:
> >>>>
> >>>> [0x513380]:ERROR amqp:connection:framing-error connection aborted
> >>>> [0x513380]:ERROR[-2] connection aborted
> >>>>
> >>>> CONNECTION ERROR connection aborted (remote)
> >>>>
> >>>> The thing is that all of these are *internally* generated messages sent
> >>>> to the console via fprintf, so my *application* doesn't really know
> >>>> about them (though I could be crafty and interpose my own cheeky fprintf
> >>>> to intercept them). That doesn't quite sound like the desired behaviour
> >>>> for a robust system?
> >>>>
> >>>>
> >>>> Similarly, should I actually trap an error, what's the correct way to
> >>>> continue? As it happens, currently my app carries on silently doing
> >>>> nothing useful, and continues to do so even when the peer restarts (so
> >>>> there is no magic internal reconnection logic as far as I can see).
> >>>>
> >>>> do I have to do a
> >>>> messenger.stop()
> >>>> messenger.start()
> >>>>
> >>>> cycle to get things going again? I'm guessing so, but I'd like to know
> >>>> the "correct"/expected way to create Messenger code that is robust
> >>>> against remote failures. As far as I can see there are no examples of
> >>>> that sort of thing.
> >>> I've come up against similar problems, I think it's an area that needs
> >>> some work in Proton. Is anybody already working on/thinking about this
> >>> area?
> >>>
> >>> Cheers,
> >>> Alan.
> >>>
> >> I'd definitely like to know how others deal with this sort of thing.
> > I cheat. I've been using proton in dispatch system tests, I come up
> > against these issues when I start up some proton/dispatch network and
> > try to use it too quickly before things have settled down. I have some
> > tweaks in my test harness to wait till things are ready so there are no
> > errors :) That's not a solution for general non-test situations -
> > although knowing how to wait till things are ready is always useful.
> >
> > https://svn.apache.org/repos/asf/qpid/dispatch/trunk/tests/system_test.py
> >
> > class Messenger adds a "flush" method that pumps the Messenger event
> > loop till there is no more work to do. Otherwise subscribe() in
> > particular gives no way to tell when the subscription is active.
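
For anyone who wants to see the shape of that "flush" idea, here is a
minimal Python sketch. It is not the code from system_test.py. It assumes
the messenger exposes a work(timeout) call that returns a truthy value
while it had something to process and a falsy one once it is idle;
FakeMessenger, step and deadline are made-up names for illustration only.

```python
import time


def flush(messenger, step=0.1, deadline=5.0):
    """Pump messenger.work() until it reports no more work.

    Returns True if the messenger went idle, False if `deadline`
    seconds passed while it was still busy.
    """
    end = time.monotonic() + deadline
    while time.monotonic() < end:
        # Assumed contract: truthy means "I processed something".
        if not messenger.work(step):
            return True
    return False


class FakeMessenger:
    """Hypothetical stand-in so the sketch runs without proton installed."""

    def __init__(self, pending):
        self.pending = pending  # units of work still to do
        self.calls = 0          # how many times work() was called

    def work(self, timeout):
        self.calls += 1
        if self.pending > 0:
            self.pending -= 1
            return True
        return False
```

The deadline is there so that a peer that never settles cannot hang the
caller forever; whether a timed-out flush should count as an error is left
to the application.
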
> >
> > Note: My situation is a bit special in that dispatch creates addresses
> > dynamically on subscribe and my tests involve slow stuff like waypoints
> > to brokers etc. That introduces a delay in subscribe that probably isn't
> > visible when the address is created beforehand.
> >
> > There's also Qpidd.wait_ready and Qdrouterd.wait_ready that wait for
> > qpidd and dispatch router to be ready respectively so I can be sure that
> > when I connect with proton they'll be listening. Those wait for the
> > expected listening ports to be connectable and, in the case of dispatch,
> > also do a qmf check to make sure that all expected outgoing connectors
> > are there.
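
The "wait for the listening port to be connectable" part can be sketched
with nothing but the Python standard library. This is only an illustration
of the idea, not the wait_ready code from the test harness; the name
wait_port and its parameters are assumptions, and the qmf check is not
covered.

```python
import socket
import time


def wait_port(port, host="127.0.0.1", timeout=10.0, interval=0.1):
    """Return once a TCP connect to (host, port) succeeds.

    Raises TimeoutError if the port is still not connectable after
    `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Connect and close straight away: we only care that
            # something is listening.
            with socket.create_connection((host, port), timeout=interval):
                return
        except OSError as e:
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    "%s:%s not connectable: %s" % (host, port, e))
            time.sleep(interval)
```

A connectable port only shows that the process is listening, which is why
the dispatch case described above needs an extra check for the outgoing
connectors.
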
> >
> >> For info: leaving aside that I can't necessarily trap all the
> >> errors without being devious around fprintf (which to be fair works,
> >> but it's a bit sneaky and if you have multiple Messenger instances won't
> >> tell you which one the error relates to), when I do get an error I
> >> appear to have to start from scratch - in other words:
> >>
> >> message.free();
> >> messenger.free();
> >> message = new proton.Message();
> >> messenger = new proton.Messenger();
> >> messenger.start();
> >>
> >> If I try to restart the original messenger or use existing queue I get
> >> no joy. It's not the end of the world but I've no idea what robust
> >> Messenger code is *supposed* to look like.
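
The "throw it away and start from scratch" workaround can be wrapped up so
the application only writes it once. The Python sketch below is an
assumption-heavy illustration, not a recommended or official pattern: it
presumes the failure actually reaches the application as an exception
(which is exactly what this thread says is not guaranteed), and make,
dispose and Rebuilding are hypothetical names.

```python
class Rebuilding:
    """Keep one messenger-like object; replace it whenever a call fails."""

    def __init__(self, make, dispose):
        self._make = make        # factory: builds and starts a new object
        self._dispose = dispose  # cleanup: frees a broken object
        self._current = make()

    def call(self, fn, attempts=3):
        """Run fn(current); on failure rebuild from scratch and retry."""
        last = None
        for _ in range(attempts):
            try:
                return fn(self._current)
            except Exception as e:
                last = e
                # Mirror the free-then-recreate sequence quoted above.
                self._dispose(self._current)
                self._current = self._make()
        raise RuntimeError("gave up after %d attempts" % attempts) from last
```

With a real binding, make would create and start the messenger and dispose
would free it; any message objects tied to the old instance would need the
same treatment.
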
> >>
> >> Presumably Alan and I aren't the only people who might like to be able
> >> to trap errors and restart? Or does every one else write code that never
> >> fails ;->
> > I always wondered how everybody but me can do that. Sigh. For you and me
> > I think we need to do some work on proton's error handling.
> >
> > - proton (or any library!) should NEVER EVER write anything direct to
> > stdout or stderr. It needs a (very simple) logging facility that can
> > write to stderr by default but can be redirected elsewhere.
> > - proton should never log an error without also returning some useful
> > error condition to the application.
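
The two points above fit together naturally: one small logging hook that
defaults to stderr but can be redirected, and a single helper that both
logs and returns the error so the two can never drift apart. A minimal
Python sketch of that idea follows; set_log_sink, log, fail and Error are
illustrative names, not part of proton.

```python
import sys
from collections import namedtuple

# The error condition handed back to the application.
Error = namedtuple("Error", "code text")

_sink = None  # None means "write to stderr"


def set_log_sink(sink):
    """Redirect log lines to sink(line); pass None to restore stderr."""
    global _sink
    _sink = sink


def log(line):
    if _sink is None:
        sys.stderr.write(line + "\n")
    else:
        _sink(line)


def fail(code, text):
    """Log an error and return the same condition to the caller."""
    log("ERROR[%d] %s" % (code, text))
    return Error(code, text)
```

An application with several messenger instances could install one sink per
instance, which would also answer the "which one did the error relate to"
question raised earlier in the thread.
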
> >
> > Proton has some useful pn_error_* functions, they just need to be used
> > more widely. In dispatch I introduced an errno-style thread-local error
> > code/message (in proton it would be a pn_error_t*). That allows sensible
> > error messages out of functions that want to return something else (e.g.
> > return a pointer or null and set the thread error). It also allows you to
> > work around lazy error handling (temporarily of course (hahahaha)) - a
> > caller a couple of stack frames up can detect an error even if
> > intermediate functions didn't check & propagate errors properly. I'm not
> > advocating lazy error checking but in C it is hard to get everything.
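
The errno-style idea is easy to show in a few lines of Python, even though
the real thing would be C. The sketch below is not dispatch's or proton's
code; every name in it is hypothetical. As with errno, functions set the
error on failure and never clear it on success, so the caller clears it
before the call it cares about.

```python
import threading

_tls = threading.local()  # one error slot per thread


def set_error(code, text):
    _tls.error = (code, text)


def clear_error():
    _tls.error = None


def last_error():
    """Return (code, text) for this thread, or None if no error is set."""
    return getattr(_tls, "error", None)


def find_link(name):
    """Return a link object, or None with the reason in last_error()."""
    if not name:
        set_error(-1, "empty link name")
        return None
    return {"name": name}


def lazy_middle(name):
    # Deliberately sloppy: ignores find_link's result entirely.
    find_link(name)
    return "done"
```

A caller a couple of frames up can still notice the failure: after
clear_error() and lazy_middle(""), last_error() holds the code and message
set inside find_link, while other threads keep their own independent slot.
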
> >
> > FEEDBACK PLEASE: anyone think this is a great/horrible idea? Does proton
> > already do things I've missed that would make this unnecessary?
> >
> > Cheers,
> > Alan.
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
> > For additional commands, e-mail: users-help@qpid.apache.org
> >
> 
> 
