Message-ID: <52C7A9A6.2010302@zeus.net.au>
Date: Sat, 04 Jan 2014 16:26:46 +1000
From: Peter Firmstone
To: dev@river.apache.org
Subject: Re: Build failed in Jenkins: river-qa-refactor-win #45

On 4/01/2014 3:40 PM, Greg Trasuk wrote:
> I accidentally deleted something from the top of that message...
>
> I checked out the qa_refactor branch, and compared the 2.2 implementation of com.sun.jini.RegistrarImpl with the qa_refactor implementation.
> In the 2.2 branch, Reggie uses an instance of TaskManager that is looked up through a Configuration instance. In the qa_refactor branch, you've replaced that with two separate hard-coded instances of ThreadPoolExecutor.
>
> In the 2.2 implementation, it was possible to have Reggie share a TaskManager with other services, or with one or more ServiceDiscoveryManagers, by creating the appropriate configuration file. That's no longer possible.

Have a look at Mahalo; it shows an ExecutorService set by configuration, but with a wrapper class to bolt on required functionality. That's how Reggie will look when it's finished.

> Further, in a container scenario, we'd like to move away from having individual services creating their own threads, towards a shared work manager. Ideally, we would have Reggie and all other services use a shared work manager rather than creating their own threads, as they currently do. That way, the container can manage and prioritize executions appropriately (e.g. app A's tasks take priority over app B's), and they can all share a thread pool. Removing the TaskManager usage moves us further from that goal. I suspect that TaskManager might have to be extended to do this properly, and I certainly would prefer that it were an interface rather than a concrete class, but it's a decent starting point.

We'll be able to do something similar with ExecutorService. Most of the threads have been converted to Runnables that are passed as an argument to Thread's constructor (instead of extending Thread), so they could be passed to an executor instead.
However, it's also worth noting that there are cases where it isn't recommended for certain tasks to share an Executor.

The other issue is shutdown: if you shut down one service and it shares a TaskManager or ExecutorService with another service, then the other service will be left with a terminated TaskManager or ExecutorService.

We really do need to dump TaskManager; it can't hold a candle to Doug Lea's Executor framework. I read somewhere that the Jini development team had planned to replace TaskManager due to issues with Task.runAfter?

When I profile the stress tests now, the hotspots are Sockets and there's very little monitor contention.

Regards,

Peter.

> On Jan 4, 2014, at 12:18 AM, Greg Trasuk wrote:
>
>> I'll also point out Patricia's recent statement that TaskManager should be reasonably efficient for small task queues, but less efficient for larger task queues. We don't have solid evidence that the task queues ever get large. Hence, the assertion that "TaskManager doesn't scale" is meaningless. If real usage never requires a large task queue, then scalability isn't an issue, and we don't know whether it ever needs a large task queue.
>>
>> In any case, removing TaskManager and replacing it with hard-coded ThreadPoolExecutors moves us farther away from having the capability of a shared work queue. So I'm not in favour of this change. I haven't looked at the other services or utility classes, but if the changes are similar, I'm also not in favour. You're introducing changes that introduce test failures (which is why you're asking for help) without a good reason. You're never going to ship this code unless you stop modifying it.
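
Two concerns meet in this exchange: services sharing one pool, and one service's shutdown terminating that pool for everyone else. A minimal sketch of one way to reconcile them is a per-service view over a shared ExecutorService, where shutdown() only closes the view. The class name ServiceScopedExecutor is hypothetical and is not part of River; the sketch also assumes the container, not the service, owns and eventually shuts down the shared pool.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.AbstractExecutorService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical per-service view of a shared pool (not a River class).
 * Shutting down the view stops this service from submitting work,
 * but leaves the shared pool running for other services.
 */
final class ServiceScopedExecutor extends AbstractExecutorService {
    private final ExecutorService shared; // owned by the container (assumption)
    private volatile boolean shutdown;

    ServiceScopedExecutor(ExecutorService shared) {
        this.shared = shared;
    }

    @Override
    public void execute(Runnable command) {
        if (shutdown) {
            throw new RejectedExecutionException("service view is shut down");
        }
        shared.execute(command);
    }

    @Override
    public void shutdown() {
        shutdown = true; // deliberately does NOT call shared.shutdown()
    }

    @Override
    public List<Runnable> shutdownNow() {
        shutdown = true;
        // Simplification: tasks already handed to the shared pool are not cancelled.
        return Collections.emptyList();
    }

    @Override
    public boolean isShutdown() {
        return shutdown;
    }

    @Override
    public boolean isTerminated() {
        // Simplification: in-flight tasks of this view are not tracked.
        return shutdown;
    }

    @Override
    public boolean awaitTermination(long timeout, TimeUnit unit) {
        // Simplification: returns immediately, see isTerminated().
        return shutdown;
    }
}
```

A fuller version would track the view's in-flight tasks so that isTerminated() and awaitTermination() are accurate, and priorities between services would need more than this.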
>>
>> Also, when you say below,
>>> I'm developing an ExecutorService wrapper that retries failed tasks in org.apache.river.impl.thread.SerialExecutorService; by not removing a task from its queue until it completes successfully, it prevents any dependent tasks from running. I would like to use this as a replacement for TaskManager and RetryTask.
>> ...be careful! You're getting into the same difficult area as transactional semantics around messaging. Will you need to provide a "dead task" queue? Do you need to set a limit on how many times a task gets retried? What happens when that limit is exceeded? Do all tasks have the same limit? Should a task get notified when it's exceeded the retry limit? How long should you wait between retries? Is that number the same for all tasks? Is there some kind of alarm or notification when tasks end up being retried, or when the dead task queue becomes full?
>>
>> Sometimes it's best not to try to abstract away all complexity.
>>
>> Greg.
>>
>> On Jan 3, 2014, at 10:43 PM, Peter Firmstone wrote:
>>
>>> ServiceDiscoveryManager is now the only class that utilises TaskManager and RetryTask. JoinManager still uses TaskManager but not RetryTask. See River-344 for an explanation of the problem.
>>>
>>> Most instances of TaskManager in qa-refactor have been replaced with ExecutorService; RetryTask now implements RunnableFuture and can be cancelled by Future.cancel from the ExecutorService.
>>>
>>> I'm developing an ExecutorService wrapper that retries failed tasks in org.apache.river.impl.thread.SerialExecutorService; by not removing a task from its queue until it completes successfully, it prevents any dependent tasks from running. I would like to use this as a replacement for TaskManager and RetryTask.
>>>
>>> Can anyone spare time to review, suggest alternatives, or improvements?
>>>
>>> Thanks in advance,
>>>
>>> Peter.
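
To make the retry questions above concrete, here is a minimal sketch of a serial executor that keeps a task at the head of its queue until it succeeds or runs out of attempts. It is not the org.apache.river.impl.thread.SerialExecutorService code; the class name SerialRetryExecutor, the single maxAttempts limit shared by all tasks, the immediate retry with no delay, and the choice to move an exhausted task to a dead-task list so later tasks can proceed are all assumptions picked for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.Executor;

/**
 * Hypothetical sketch: runs tasks one at a time, in submission order.
 * The head task is only removed once it succeeds or exhausts maxAttempts.
 */
final class SerialRetryExecutor implements Executor {
    private final Executor backing;
    private final int maxAttempts;
    private final Deque<Runnable> queue = new ArrayDeque<>();    // guarded by this
    private final List<Runnable> deadTasks = new ArrayList<>();  // guarded by this
    private boolean draining;                                    // guarded by this

    SerialRetryExecutor(Executor backing, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.backing = backing;
        this.maxAttempts = maxAttempts;
    }

    @Override
    public void execute(Runnable task) {
        synchronized (this) {
            queue.addLast(task);
            if (draining) {
                return; // the active drain loop will pick the task up
            }
            draining = true;
        }
        try {
            backing.execute(this::drain);
        } catch (RuntimeException e) {
            // Backing executor rejected the drain job; the task stays queued
            // and runs with the next successful submission.
            synchronized (this) {
                draining = false;
            }
            throw e;
        }
    }

    private void drain() {
        while (true) {
            Runnable head;
            synchronized (this) {
                head = queue.peekFirst(); // peek, not poll: head blocks later tasks
                if (head == null) {
                    draining = false;
                    return;
                }
            }
            boolean succeeded = false;
            for (int attempt = 1; attempt <= maxAttempts && !succeeded; attempt++) {
                try {
                    head.run();
                    succeeded = true;
                } catch (RuntimeException e) {
                    // Retry immediately; a real implementation would likely back off.
                }
            }
            synchronized (this) {
                queue.pollFirst();
                if (!succeeded) {
                    deadTasks.add(head); // one possible policy when the limit is exceeded
                }
            }
        }
    }

    /** Snapshot of tasks that exhausted their attempts. */
    synchronized List<Runnable> deadTasks() {
        return new ArrayList<>(deadTasks);
    }
}
```

Even this small sketch has to pick answers to several of the questions raised: the limit is global, there is no delay or notification, and a dead task unblocks its dependants, which may or may not be acceptable.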
>>>
>>> Failed com_sun_jini_test_impl_servicediscovery_event_DiscardDownReDiscover.td
>>> Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 30 seconds (0 minutes) -- 2 discovery event(s) expected, 1 discovery event(s) received
>>>
>>> Failed com_sun_jini_test_impl_servicediscovery_event_DiscardServiceDown.td
>>> Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s) received
>>>
>>> Failed com_sun_jini_test_impl_servicediscovery_event_DiscardServiceUp.td
>>> Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s) received
>>>
>>> Failed com_sun_jini_test_impl_servicediscovery_event_LookupTaskRace.td
>>> Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s) received
>>>
>>> Failed com_sun_jini_test_impl_servicediscovery_event_ReRegisterBadEquals.td
>>> Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 30 seconds (0 minutes) -- 4 discovery event(s) expected, 0 discovery event(s) received
>>>
>>> Failed com_sun_jini_test_impl_servicediscovery_event_ReRegisterGoodEquals.td
>>> Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 30 seconds (0 minutes) -- 4 discovery event(s) expected, 0 discovery event(s) received
>>>
>>> Failed com_sun_jini_test_impl_servicediscovery_event_ServiceDiscardCacheTerminate.td
>>> Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 30 seconds (0 minutes) -- 4 discovery event(s) expected, 0 discovery event(s) received
>>>
>>> Failed com_sun_jini_test_spec_servicediscovery_cache_CacheDiscard.td
>>> Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s) received
>>>
>>> Failed com_sun_jini_test_spec_servicediscovery_cache_CacheLookup.td
>>> Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s) received
>>>
>>> Failed com_sun_jini_test_spec_servicediscovery_lookup_Lookup.td
>>> Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s) received
>>>
>>> Failed com_sun_jini_test_spec_servicediscovery_lookup_LookupMax.td
>>> Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 30 seconds (0 minutes) -- 3 discovery event(s) expected, 2 discovery event(s) received
>>>
>>> Failed com_sun_jini_test_spec_servicediscovery_lookup_LookupMaxFilter.td
>>> Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 30 seconds (0 minutes) -- 3 discovery event(s) expected, 0 discovery event(s) received
>>>
>>> Failed com_sun_jini_test_spec_servicediscovery_lookup_LookupMinEqualsMax.td
>>> Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 30 seconds (0 minutes) -- 3 discovery event(s) expected, 0 discovery event(s) received
>>>
>>> Failed com_sun_jini_test_spec_servicediscovery_lookup_LookupMinMaxNoBlockFilter.td
>>> Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 30 seconds (0 minutes) -- 3 discovery event(s) expected, 0 discovery event(s) received
>>>
>>> Failed com_sun_jini_test_spec_servicediscovery_lookup_LookupWait.td
>>> Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s) received
>>>
>>> Failed com_sun_jini_test_spec_servicediscovery_lookup_LookupWaitFilter.td
>>> Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 30 seconds (0 minutes) -- 2 discovery event(s) expected, 1 discovery event(s) received
>>>
>>> Failed com_sun_jini_test_spec_servicediscovery_lookup_LookupWaitNoBlock.td
>>> Test Failed: com.sun.jini.qa.harness.TestException: discovery failed -- waited 30 seconds (0 minutes) -- 2 discovery event(s) expected, 0 discovery event(s) received
>>>
>>>
>>>
>>> On 4/01/2014 10:27 AM, Apache Jenkins Server wrote:
>>>> See
>>>>
>>>> ------------------------------------------
>>>> [...truncated 15733 lines...]
>>>> [java] Test Passed: OK
>>>> [java]
>>>> [java] -----------------------------------------
>>>> [java] com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionTakeIfExistsTest.td
>>>> [java] Test Passed: OK
>>>> [java]
>>>> [java] -----------------------------------------
>>>> [java] com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionTakeIfExistsWaitTest.td
>>>> [java] Test Passed: OK
>>>> [java]
>>>> [java] -----------------------------------------
>>>> [java] com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionTakeNO_WAITTest.td
>>>> [java] Test Passed: OK
>>>> [java]
>>>> [java] -----------------------------------------
>>>> [java] com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionTakeReadTest.td
>>>> [java] Test Passed: OK
>>>> [java]
>>>> [java] -----------------------------------------
>>>> [java] com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionTakeTest.td
>>>> [java] Test Passed: OK
>>>> [java]
>>>> [java] -----------------------------------------
>>>> [java] com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionTakeWaitTest.td
>>>> [java] Test Passed: OK
>>>> [java]
>>>> [java] -----------------------------------------
>>>> [java] com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteLeaseANYTest.td
>>>> [java] Test Passed: OK
>>>> [java]
>>>> [java] -----------------------------------------
>>>> [java]
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteLeaseFOREVERTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteNegativeLeaseTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteTakeIfExistsNotifyTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteTakeIfExistsTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteTakeNotifyTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteTakeTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotTransactionWriteTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotWriteLeaseANYTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotWriteLeaseFOREVERTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotWriteNegativeLeaseTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] 
com/sun/jini/test/spec/javaspace/conformance/snapshot/SnapshotWriteTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/AdminIFShutdownTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/AdminIFTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/LeaseExpireCancelTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/LeaseExpireRenewTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/LeaseMapTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/LeaseTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/MahaloCreateShutdownTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/MahaloIFTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/MahaloImplReadyStateTest.td >>>> [java] Test Skipped: verifiers are: com.sun.jini.test.impl.mercury.ActivatableMercuryVerifier com.sun.jini.qa.harness.SkipConfigTestVerifier >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/NestableServerTransactionCreatedToStringTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/NestableTransactionCreatedToStringTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> 
[java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/PrepareAndCommitExceptionTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/PrepareAndCommitExceptionTest2.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/PrepareAndCommitExceptionTest3.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/PrepareAndCommitExceptionTest4.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/PrepareAndCommitExceptionTest5.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/RandomStressTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/ServerTransactionEqualityTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/ServerTransactionToStringTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/TransactionCreatedToStringTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/TransactionManagerCreatedToStringTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/TxnMgrImplNullActivationConfigEntries.td >>>> [java] Test Skipped: verifiers are: com.sun.jini.test.impl.mahalo.ActivatableMahaloVerifier >>>> [java] ----------------------------------------- >>>> [java] 
com/sun/jini/test/impl/mahalo/TxnMgrImplNullConfigEntries.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/TxnMgrImplNullRecoveredLocators.td >>>> [java] Test Skipped: verifiers are: com.sun.jini.test.impl.mahalo.ActivatableMahaloVerifier >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/impl/mahalo/TxnMgrProxyEqualityTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/spec/txnmanager/AsynchAbortOnCommitTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/spec/txnmanager/AsynchAbortOnPrepareTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/spec/txnmanager/CommitExpiredTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/spec/txnmanager/CommitTimeoutTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/spec/txnmanager/GetStateTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/spec/txnmanager/JoinIdempotentTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/spec/txnmanager/JoinWhileActiveTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/spec/txnmanager/ManyParticipantsTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/spec/txnmanager/PrepareTimeoutTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] 
----------------------------------------- >>>> [java] com/sun/jini/test/spec/txnmanager/RollBackErrorTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/spec/txnmanager/RollForwardErrorTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] com/sun/jini/test/spec/txnmanager/TwoPhaseTest.td >>>> [java] Test Passed: OK >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] >>>> [java] # of tests started = 1406 >>>> [java] # of tests completed = 1406 >>>> [java] # of tests skipped = 52 >>>> [java] # of tests passed = 1388 >>>> [java] # of tests failed = 18 >>>> [java] >>>> [java] ----------------------------------------- >>>> [java] >>>> [java] Date finished: >>>> [java] Fri Jan 03 16:27:03 PST 2014 >>>> [java] Time elapsed: >>>> [java] 59325 seconds >>>> [java] >>>> [java] Java Result: 1 >>>> >>>> collect-result: >>>> [copy] Copying 1 file to >>>> [copy] Copying 1 file to >>>> [zip] Building zip: Server 2008 R2-1.7.0.zip >>>> >>>> BUILD FAILED >>>> :2109: The following error occurred while executing this line: >>>> :406: The following error occurred while executing this line: >>>> :380: condition satisfied >>>> >>>> Total time: 996 minutes 9 seconds >>>> Build step 'Invoke Ant' marked build as failure >>>> Archiving artifacts