From: Gregg Wonderly
To: dev@river.apache.org
Subject: Re: Question about LeaseRenewalManager and renewDuration
Date: Tue, 10 Jul 2012 13:44:33 -0500

Recall that under the covers there are also all the OS network stack
behaviors. What is the TCP SYN timeout, for example; i.e., how long will
a TCP connect request that is eventually going to fail take before it
fails?

I think it's important to understand that unless you are on either end
of a TCP connection, with the timeout and keep-alive settings for that
connection turned down to short intervals, you're going to be mystified
by the longer-than-expected timing of most failure detections.

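As a rough illustration of turning those settings down on the client end of a connection, here is a minimal sketch using only the standard java.net.Socket API. The class name BoundedConnect and the timeout values are made up for the example; it is not how River's endpoint classes do it. Note that SO_KEEPALIVE only switches keep-alive probing on; how often the probes are sent is still an OS-level setting unless it is tuned separately.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical helper, for illustration only.
public class BoundedConnect {

    /**
     * Opens a TCP connection but gives up after connectTimeoutMs rather
     * than waiting out the operating system's default SYN retry schedule.
     */
    static Socket connect(String host, int port,
                          int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        Socket socket = new Socket(); // unconnected socket
        try {
            // Bounded connect: throws SocketTimeoutException on expiry.
            socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
            socket.setKeepAlive(true);          // enable OS keep-alive probes
            socket.setSoTimeout(readTimeoutMs); // bound blocking reads
            return socket;
        } catch (IOException e) {
            socket.close();
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        // Demonstrate against a throwaway local listener on an ephemeral port.
        try (ServerSocket server = new ServerSocket(0);
             Socket s = connect("127.0.0.1", server.getLocalPort(), 2000, 5000)) {
            System.out.println("connected=" + s.isConnected()
                    + " keepAlive=" + s.getKeepAlive()
                    + " soTimeout=" + s.getSoTimeout());
        }
    }
}
```

Without the explicit connect timeout, a connect to an unreachable host blocks until the OS gives up, which is usually far longer than an application-level "liveness" expectation.
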
Subclassing the appropriate endpoint class, adjusting its behavior, and
using that on your registrar may be part of what you need to do to see
quick notifications.

Gregg Wonderly

On Jul 10, 2012, at 10:49 AM, Greg Trasuk wrote:

>
> On Tue, 2012-07-10 at 10:14, Itai Frenkel wrote:
>>>> Are you sure about that?
>> Looking at RegistrarImpl, when ThrowableConstants.retryable(e) returns
>> BAD_OBJECT, it rethrows only if (e instanceof Error); otherwise it
>> cancels the lease. Since ConnectException is not an Error, the lease
>> would be canceled.
>> Why is the Error check being performed?
>>
> ThrowableConstants.retryable(e) only returns BAD_OBJECT if it receives a
> definite response from the remote endpoint. For a comm failure, it
> should return INDEFINITE. Having said that, the logic seems to favour
> declaring an exception "Definite" where it might be arguable. For
> instance, it will declare BAD_OBJECT in the case of a "No route to host"
> exception, which arguably could be temporary, for instance if a router
> goes offline.
>
>>>> Personally, I'd use an internal timer on the client side that says
>>>> "if I don't receive any events for a given time, I'll cancel the
>>>> current lease and re-register".
>> That requires the Registrar to periodically send probe notifications.
>> The number of real world notifications could fluctuate from zero to
>> high load and cannot be trusted without probe notifications.
>>
> Might be an interesting improvement if a client could request a
> heartbeat or supervisory message from the registrar. But my point above
> was that if the events are not coming fast enough to satisfy a
> reasonable "liveness" timeout, then it's probably not a big problem if
> the client simply cancels the lease and re-registers. So you could
> effectively implement your own heartbeat.
>
> Alternately (subject to exploring the loading and the number of clients)
> you could create a service that does nothing but register, then update
> its service attributes periodically, which would have the effect of
> generating registrar messages. Starting to get a little complicated and
> indirect, though.
>
> In the end, however, it seems like you're trying to have the client find
> out that it's not receiving registrar notifications. I can't think of
> any better evidence than "you're not receiving registrar notifications".
>
> Cheers,
>
> Greg.
>
>> Thanks,
>> Itai
>>
>> -----Original Message-----
>> From: Greg Trasuk [mailto:trasukg@stratuscom.com]
>> Sent: Tuesday, July 10, 2012 4:36 PM
>> To: dev@river.apache.org
>> Subject: Re: Question about LeaseRenewalManager and renewDuration
>>
>>
>> On Tue, 2012-07-10 at 06:41, Itai Frenkel wrote:
>>
>>> Background Information:
>>> The motivation for this is the way the Registrar handles event
>>> notifications.
>>> When the Registrar fails to send a notification to a listener due to a
>>> temporary network glitch, it assumes the listener is no longer
>>> available and cancels the event lease.
>>
>> Are you sure about that? Looking through
>> com.sun.jini.reggie.RegistrarImpl, it appears that when an exception
>> occurs during event notification, the code tries to categorize the
>> exception as either "definite" (no such event, no such object, etc.) or
>> "indefinite" (communications failure). Then it only cancels the lease
>> on a definite exception.
>>
>> In other words, the lease is maintained in the case of a temporary
>> network failure. After all, that's the whole point of the lease: it
>> represents an agreement between the client and service that resources
>> are going to be maintained for a definite time period.
>>
>> Personally, I'd use an internal timer on the client side that says
>> "if I don't receive any events for a given time, I'll cancel the
>> current lease and re-register". If the events are that quiet, then
>> clearly the registrar is not that heavily loaded, so the overhead of
>> cancelling the lease and creating a new registration should not be too
>> bad. You'd want to test it under simulated load, of course.
>>
>> Cheers,
>>
>> Greg.
>> --
>> Greg Trasuk, President
>> StratusCom Manufacturing Systems Inc. - We use information technology
>> to solve business problems on your plant floor.
>> http://stratuscom.com
>>
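
The client-side timer suggested in the quoted message could be sketched roughly as below. This is only an illustration using java.util.concurrent: EventWatchdog, eventReceived() and the onSilence callback are hypothetical names, not part of River, and the callback is where a real client would cancel its current event lease and re-register with the registrar.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only.
public class EventWatchdog implements AutoCloseable {
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();
    private final long quietMillis;
    private final Runnable onSilence;   // e.g. cancel lease, then re-register
    private ScheduledFuture<?> pending; // guarded by "this"

    public EventWatchdog(long quietMillis, Runnable onSilence) {
        this.quietMillis = quietMillis;
        this.onSilence = onSilence;
        arm();
    }

    /** Call from the remote event listener each time an event arrives. */
    public void eventReceived() {
        arm();
    }

    // Cancels any pending timeout and starts a fresh quiet period.
    private synchronized void arm() {
        if (pending != null) {
            pending.cancel(false);
        }
        try {
            pending = timer.schedule(this::fire, quietMillis, TimeUnit.MILLISECONDS);
        } catch (RejectedExecutionException e) {
            // Watchdog already closed; nothing to re-arm.
        }
    }

    // Runs when no event has arrived for quietMillis.
    private void fire() {
        try {
            onSilence.run();
        } finally {
            arm(); // keep watching after the re-registration attempt
        }
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        try (EventWatchdog w = new EventWatchdog(200,
                () -> System.out.println("quiet too long: would cancel lease and re-register"))) {
            for (int i = 0; i < 5; i++) { // events arriving regularly: no timeout
                Thread.sleep(50);
                w.eventReceived();
            }
            Thread.sleep(500);            // silence: the callback runs
        }
    }
}
```

The quiet period has to be chosen against the expected event rate; as noted in the thread, a registrar that is genuinely idle will also trip the timer, so the cost of an occasional unnecessary re-registration should be tested under simulated load.
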