Mailing-List: contact dev-help@lucene.apache.org; run by ezmlm
Reply-To: dev@lucene.apache.org
Date: Mon, 28 Feb 2011 21:05:37 +0000 (UTC)
From: "Uwe Schindler (JIRA)"
To: dev@lucene.apache.org
Message-ID: <196953642.2845.1298927137106.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1327055274.2803.1298925817275.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] Commented: (LUCENE-2943) ICU collator thread-safety issues
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

[
https://issues.apache.org/jira/browse/LUCENE-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13000511#comment-13000511 ]

Uwe Schindler commented on LUCENE-2943:
---------------------------------------

I changed my mind a little bit: the cloning of the Collator for the Filter should be done in the Analyzer, not in the Filter. The same applies to the AttributeImpl: the cloning should be done in the ctor.

The problem is not that the TokenStream or the Attribute instance may reuse the attribute in different threads. The problem is that the factory class (the Analyzer) reuses the Collator across threads when it produces multiple TokenStreams, and likewise the AF (AttributeFactory) when it produces multiple attributes.

The difference is slight but it matters, because the following code is always safe, and cloning there would be wrong:

    new CollationFilter(Collator.newInstance(lang))

The reason for the whole thing: TokenStream and Attribute instances themselves are single-threaded only, but the factory and the analyzer are not.

> ICU collator thread-safety issues
> ---------------------------------
>
>                 Key: LUCENE-2943
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2943
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Analysis
>            Reporter: Robert Muir
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2943.patch
>
>
> The ICU Collators (unlike the JDK ones) aren't thread safe: http://userguide.icu-project.org/collation/architecture
> This is a little non-obvious since it's not mentioned in the javadocs, and it's not clear
> whether the docs apply only to the C code, but I looked at the source and there is all
> kinds of internal state.
> So in my opinion, we should clone the ICU collators (which are passed in from the outside)
> when creating a new TokenStream/AttributeImpl to prevent problems. This shouldn't be a big
> deal since everything uses reusableTokenStream anyway.
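Editor's illustration of the distinction drawn in the comment above: a minimal sketch, not the code from LUCENE-2943.patch. All class and method names (CollatorCloningSketch, KeyFactory, KeyProducer) are hypothetical, and java.text.Collator from the JDK stands in for the ICU Collator so that the example runs without extra libraries. The assumption is that the shared factory plays the role of the Analyzer and the per-use object plays the role of the single-threaded Filter.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical names throughout; java.text.Collator stands in for the ICU Collator.
final class CollatorCloningSketch {

    /** Single-threaded consumer (role of the filter): uses the collator it is given as-is. */
    static final class KeyProducer {
        private final Collator collator;

        KeyProducer(Collator collator) {
            this.collator = collator; // no clone here: the caller decides who owns the instance
        }

        Collator collator() {
            return collator;
        }

        byte[] key(String text) {
            return collator.getCollationKey(text).toByteArray();
        }
    }

    /** Shared factory (role of the analyzer): may hand out producers to many threads. */
    static final class KeyFactory {
        private final Collator prototype;

        KeyFactory(Collator prototype) {
            this.prototype = prototype;
        }

        KeyProducer newProducer() {
            // Clone where the shared instance would otherwise be handed to a new consumer.
            return new KeyProducer((Collator) prototype.clone());
        }
    }

    public static void main(String[] args) {
        KeyFactory factory = new KeyFactory(Collator.getInstance(Locale.ENGLISH));
        KeyProducer a = factory.newProducer();
        KeyProducer b = factory.newProducer();
        System.out.println("distinct collators: " + (a.collator() != b.collator()));

        // A collator created for exactly one producer has a single owner, so no clone is needed.
        KeyProducer direct = new KeyProducer(Collator.getInstance(Locale.ENGLISH));
        System.out.println("key bytes: " + direct.key("abc").length);
    }
}
```

In this sketch the only object that ever sees the shared collator is the factory, so that is the only place a clone is taken. A producer constructed directly with a fresh collator pays no cloning cost.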