From: holmberg2066@comcast.net (greg@holmberg.name)
To: uima-user@incubator.apache.org
Subject: CasPools in a general service
Date: Wed, 15 Aug 2007 06:08:34 +0000

I'm wondering how best to use CasPools in my system.
My system is a general service: it may receive concurrent requests with arbitrary AnalysisEngineDescriptions (AEDs) from different applications. Some requests may use the exact same AED object, in which case it's obvious that documents processed from those requests can share the same AnalysisEngine and CasPool.

However, it's also likely that the system will receive AEDs that are equivalent: they have the same annotators with the same configuration parameters, but they are two physically different AED objects. In this case it would be possible to use the same AE and CasPool, if there were a way to tell that they are equivalent. Unfortunately, the equals() methods on AnalysisEngineDescription and AnalysisEngine won't tell me this, so what I currently do is create separate CasPools.

Is it worth it, for performance and memory usage, to write a method that compares two AEDs to determine whether they are equivalent? Or is creating CASes and CasPools not expensive enough to justify the work, so that I should just continue with separate CasPools?

Going further, it appears that two AnalysisEngines could share the same CasPool if only their type systems are the same. The AEs themselves don't even have to be the same (they could have different configuration parameter values, for example); they merely need the same CasDefinition. Is there an easy way to determine whether two AEDs have the same type system or CasDefinition, and so could share a CasPool?

Thanks,
Greg Holmberg
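
To make the sharing idea concrete, here is a minimal sketch of a registry that hands out one pool per distinct type system. It uses only the Java standard library and no UIMA calls. The class SharedPoolRegistry, the methods fingerprint() and poolFor(), and the representation of a type system as a map from type name to a feature string are all hypothetical, invented for illustration; how to derive such a key from a real AED or CasDefinition is exactly the open question above.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/**
 * Hypothetical registry: one shared pool per distinct type-system fingerprint.
 * P stands in for whatever pool type is used (for example a CasPool).
 */
final class SharedPoolRegistry<P> {

    private final ConcurrentHashMap<String, P> pools = new ConcurrentHashMap<>();

    /**
     * Builds an order-independent key from a type system that is assumed to be
     * given as "type name -> feature list". Sorting by type name makes two
     * physically different but equivalent descriptions produce the same key.
     */
    static String fingerprint(Map<String, String> typeToFeatures) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : new TreeMap<>(typeToFeatures).entrySet()) {
            sb.append(e.getKey()).append('{').append(e.getValue()).append("};");
        }
        return sb.toString();
    }

    /**
     * Returns the pool registered for this fingerprint, creating it with the
     * given factory only if no pool exists yet. Safe under concurrent requests.
     */
    P poolFor(String fingerprint, Supplier<P> factory) {
        return pools.computeIfAbsent(fingerprint, key -> factory.get());
    }

    /** Number of distinct pools created so far. */
    int size() {
        return pools.size();
    }
}
```

With something like this, two requests whose type systems produce the same fingerprint would get the same pool back from poolFor(), and only a genuinely new type system would trigger creation of another pool. Whether the bookkeeping pays off still depends on how expensive CAS and CasPool creation really is.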