Return-Path: X-Original-To: apmail-river-dev-archive@www.apache.org Delivered-To: apmail-river-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id E2F60109F3 for ; Wed, 8 Jan 2014 17:29:08 +0000 (UTC) Received: (qmail 10858 invoked by uid 500); 8 Jan 2014 17:29:07 -0000 Delivered-To: apmail-river-dev-archive@river.apache.org Received: (qmail 10731 invoked by uid 500); 8 Jan 2014 17:29:03 -0000 Mailing-List: contact dev-help@river.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@river.apache.org Delivered-To: mailing list dev@river.apache.org Received: (qmail 10722 invoked by uid 99); 8 Jan 2014 17:29:02 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 08 Jan 2014 17:29:02 +0000 X-ASF-Spam-Status: No, hits=-0.0 required=5.0 tests=RCVD_IN_DNSWL_NONE,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of gergg@cox.net designates 68.230.241.147 as permitted sender) Received: from [68.230.241.147] (HELO fed1rmfepo202.cox.net) (68.230.241.147) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 08 Jan 2014 17:28:56 +0000 Received: from fed1rmimpo210 ([68.230.241.161]) by fed1rmfepo202.cox.net (InterMail vM.8.01.05.09 201-2260-151-124-20120717) with ESMTP id <20140108172836.VVEB4705.fed1rmfepo202.cox.net@fed1rmimpo210> for ; Wed, 8 Jan 2014 12:28:36 -0500 Received: from [192.168.20.164] ([76.76.133.74]) by fed1rmimpo210 with cox id BVUV1n0011cUBck01VUWNL; Wed, 08 Jan 2014 12:28:35 -0500 X-CT-Class: Clean X-CT-Score: 0.00 X-CT-RefID: str=0001.0A020205.52CD8AC4.000C,ss=1,re=0.000,fgs=0 X-CT-Spam: 0 X-Authority-Analysis: v=2.0 cv=XIDuv3dE c=1 sm=1 a=heBF41qg0YAiVTIN72KKpA==:17 a=G8Uczd0VNMoA:10 a=N659UExz7-8A:10 a=kviXuzpPAAAA:8 a=x_lAWBps_UQA:10 a=0LiwH3idAAAA:8 a=mV9VRH-2AAAA:8 a=XXvSNfZME133PiiP-U4A:9 a=pILNOxqGKmIA:10 a=wBIjF1_GbPFjjI7n:21 a=cDNYbKtPapuYKlkH:21 a=heBF41qg0YAiVTIN72KKpA==:117 X-CM-Score: 0.00 Authentication-Results: cox.net; auth=pass (PLAIN) smtp.auth=gergg@cox.net Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 7.1 \(1827\)) Subject: Re: TaskManager as an ExecutorService From: Gregg Wonderly In-Reply-To: Date: Wed, 8 Jan 2014 11:28:28 -0600 Content-Transfer-Encoding: quoted-printable Message-Id: References: <52C7E834.2070201@zeus.net.au> <1388879476.6007.11.camel@Nokia-N900> To: dev@river.apache.org X-Mailer: Apple Mail (2.1827) X-Virus-Checked: Checked by ClamAV on apache.org =46rom the very early days of Jini, I have not had good luck with = ServiceDiscoveryManager. I actually have perhaps 3 different version of = this behavior coded into various applications in different ways, because = SDM just did not seem to provide sane behavior for me, ever. I am not = sure that I have spare cycles at this point to engage in development on = River, because I am off doing some other non-Java work at this point, = and I won=92t be as productive doing that, as I=92d like to be. I would like to get involved with River development, but I am not at a = place that I can do that right now. I don=92t want to just be a = non-participant, so I am trying to read through issues here and offer = whatever experiences I can recall as examples of why the River codebase, = truly is in need of all the work that Peter has been slaving through. In many cases, people don=92t seem to see problems. In service only = environments, perhaps things join quickly enough and you are satisfied = with what is happening. In my applications, things are heavily client = based, with clients being started and stopped continuously by people = trying to access serviceUIs to do a task. In that environment, the = behaviors of all this broken and inefficient locking/synchronization, it = becomes very painful and obvious, quite quickly that things are busted, = big time. Gregg On Jan 6, 2014, at 2:14 AM, Peter wrote: > Gregg, >=20 > Are you able to help out with ServiceDiscoveryManager? >=20 > Fixing SDM would make a huge stability improvement. >=20 > Guava has an interface called ListenableFuture, which sounds like a = possible candidate, listeners are registered and notified in the event = of failure or success. >=20 > The Guava site claims it makes it possible to have complex chains of = asynchronous operations. >=20 > Regards, >=20 > Peter. >=20 > ----- Original message ----- >> I think the better choice is to not try and handle failure with retry = at >> all. Instead, we should use API which allows the dependent task = to >> know whether it=92s dependent has completed or failed. It can then = report >> failure if its dependent failed, or submit its own work to be = executed >> in the queue. =20 >>=20 >> This linking of application behavior into utility APIs is just not = good, >> testable design. It requires that behaviors on the outside of the = pool, >> flow through to other uses of the API in non-programatic design ways. >>=20 >> Gregg Wonderly >> =20 >> On Jan 4, 2014, at 7:46 PM, Peter wrote: >>=20 >>> For a moment, lets consider how we might fix TaskManager, while >>> retaining the existing Task.runAfter method and how the fix would >>> impact TaskManager's users. >>>=20 >>> TaskManager, like most thread pools has a queue. >>>=20 >>> Before a task is removed from the queue for execution, it is asked = if >>> it should "runAfter" any other task present in the queue, while this >>> occurs, the queue is locked. >>>=20 >>> Each task is removed from the queue before it executes. >>>=20 >>> Now what happens if a task fails to execute, or it's thread is >>> suspended by the OS scheduler and is not in the queue during = execution >>> and another task/s that depend on it need to "runAfter" are allowed = to >>> execute before it completes? >>>=20 >>> Because a task doesn't know which tasks depend upon it running = first, >>> the only way to fix TaskManager is to only remove a task after it >>> completes successfully. >>>=20 >>> Now TaskManager has no scheduling capability, which means that the = fix >>> will require the task to both remain in the queue and consume a pool >>> thread until it successfully completes because once the Task.run >>> method completes it is assumed to have completed successfully. >>>=20 >>> RetryTask and WakupManager would also need to be rewritten to >>> accommodate this requirement. >>>=20 >>> The result will be correctly ordered execution of tasks, with >>> increased thread & memory consumption as well as significantly = reduced >>> throughput. >>>=20 >>> The reduced throughput could also help mask other issues by reducing >>> concurrency. >>>=20 >>> Regards, >>>=20 >>> Peter. >>>=20 >>>=20 >>> ----- Original message ----- >>>> The problem is with TaskManager's public api method Task.runAfter. = =20 >>>> This is well documented in River-344. >>>>=20 >>>> The fix requires changing every class that uses it >>>>=20 >>>> As a thread pool TaskManager is correct provided that no ordering >>>> dependencies exist between tasks. >>>>=20 >>>> TaskManager doesn't compare to Doug Lee's ExecutorService >>>> implementations, it should be consigned to history, lets clean our >>>> code up. >>>>=20 >>>> What configuration choice did we originally have anyway? Any >>>> Executor you like as long as it's TaskManager? >>>>=20 >>>> TaskManager is not part of our public api, it's an implementation >>>> detail in com.sun.* >>>>=20 >>>> Regards, >>>>=20 >>>> Peter. >>>> ----- Original message ----- >>>>>=20 >>>>> I=92d like you to make a reasonable case for why TaskManager needs >>>>> to be replaced, requiring changes to many other classes that >>>>> depend on TaskManager, rather than stating what the problem is >>>>> with TaskManager and seeking to fix it, which would only affect >>>>> TaskManager and not require modifying and then debugging other >>>>> code. >>>>>=20 >>>>> Greg. >>>>>=20 >>>>> On Jan 4, 2014, at 5:53 AM, Peter Firmstone >>>>> wrote: >>>>>=20 >>>>>> Would you like me to add this class, so that existing >>>>>> configurations utilising a TaskManager can also be used? = This >>>>>> might be useful for retaining backward compatibility with >>>>>> existing configurations? >>>>>>=20 >>>>>> Regards, >>>>>>=20 >>>>>> Peter. >>>>>>=20 >>>>>>=20 >>>>>> /* >>>>>> * Licensed to the Apache Software Foundation (ASF) under one >>>>>> * or more contributor license agreements. See the NOTICE = file >>>>>> * distributed with this work for additional information >>>>>> * regarding copyright ownership. The ASF licenses this file >>>>>> * to you under the Apache License, Version 2.0 (the >>>>>> * "License"); you may not use this file except in compliance >>>>>> * with the License. You may obtain a copy of the License at >>>>>> * >>>>>> * = http://www.apache.org/licenses/LICENSE-2.0 >>>>>> * >>>>>> * Unless required by applicable law or agreed to in writing, >>>>>> software * distributed under the License is distributed on an >>>>>> "AS IS" BASIS, * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, >>>>>> either express or implied. * See the License for the specific >>>>>> language governing permissions and * limitations under the >>>>>> License. */ >>>>>>=20 >>>>>> package com.sun.jini.thread; >>>>>>=20 >>>>>> import com.sun.jini.thread.TaskManager.Task; >>>>>> import java.util.List; >>>>>> import java.util.concurrent.AbstractExecutorService; >>>>>> import java.util.concurrent.ExecutorService; >>>>>> import java.util.concurrent.RejectedExecutionException; >>>>>> import java.util.concurrent.TimeUnit; >>>>>>=20 >>>>>> /** >>>>>> * >>>>>> * @author peter >>>>>> */ >>>>>> public class TaskManagerWrapper extends AbstractExecutorService >>>>>> implements ExecutorService { >>>>>>=20 >>>>>> private final TaskManager tm; >>>>>> private final PosionPill pill; >>>>>> private volatile boolean isShutdown; >>>>>>=20 >>>>>> public TaskManagerWrapper(TaskManager manager){ >>>>>> tm =3D manager; >>>>>> isShutdown =3D false; >>>>>> pill =3D new PosionPill(manager); >>>>>> } >>>>>>=20 >>>>>> @Override >>>>>> public void shutdown() { >>>>>> isShutdown =3D true; >>>>>> tm.add(pill); >>>>>> } >>>>>>=20 >>>>>> @Override >>>>>> public List shutdownNow() { >>>>>> isShutdown =3D true; >>>>>> tm.terminate(); >>>>>> return tm.getPending(); >>>>>> } >>>>>>=20 >>>>>> @Override >>>>>> public boolean isShutdown() { >>>>>> return isShutdown; >>>>>> } >>>>>>=20 >>>>>> @Override >>>>>> public boolean isTerminated() { >>>>>> return isShutdown; >>>>>> } >>>>>>=20 >>>>>> @Override >>>>>> public boolean awaitTermination(long timeout, TimeUnit unit) >>>>>> throws InterruptedException { long start =3D >>>>>> System.currentTimeMillis(); long duration =3D >>>>>> unit.toMillis(timeout); synchronized (pill){ >>>>>> while (!pill.terminated){ >>>>>> wait(duration); >>>>>> if (pill.terminated) return true; >>>>>> long elapsed =3D System.currentTimeMillis() - start; >>>>>> if (elapsed >=3D duration) return false; >>>>>> duration =3D duration - elapsed; >>>>>> } >>>>>> } >>>>>> return true; // pill was terminated. >>>>>> } >>>>>>=20 >>>>>> @Override >>>>>> public void execute(Runnable command) { >>>>>> if (isShutdown) throw new RejectedExecutionException("TaskManager >>>>>> terminated"); } >>>>>>=20 >>>>>> private static class PosionPill implements Task { >>>>>>=20 >>>>>> private final TaskManager tm; >>>>>> boolean terminated; >>>>>>=20 >>>>>> PosionPill(TaskManager tm){ >>>>>> this.tm =3D tm; >>>>>> terminated =3D false; >>>>>> } >>>>>>=20 >>>>>> @Override >>>>>> public boolean runAfter(List tasks, int size) { >>>>>> if (!tasks.isEmpty()) return true; // Make sure we always run >>>>>> last. return false; >>>>>> } >>>>>>=20 >>>>>> @Override >>>>>> public void run() { >>>>>> tm.terminate(); // Bye. >>>>>> synchronized (this){ >>>>>> terminated =3D true; >>>>>> notifyAll(); >>>>>> } >>>>>> } >>>>>>=20 >>>>>> } >>>>>>=20 >>>>>> } >>>>>=20 >>>>=20 >>>=20 >>=20 >=20