lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (LUCENE-2943) ICU collator thread-safety issues
Date Mon, 28 Feb 2011 21:08:37 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13000511#comment-13000511 ]

Uwe Schindler edited comment on LUCENE-2943 at 2/28/11 9:07 PM:
----------------------------------------------------------------

I changed my mind a little bit:

The cloning of the Collator should be done in the Analyzer, not in the Filter. The same applies
to the AttributeImpl: the cloning should not be done in the ctor. The problem is not that
the TokenStream or the Attribute instance may reuse the attribute in different threads; the
problem is that the factory class (the Analyzer) reuses the Collator in different threads
when it produces multiple TokenStreams, or the AttributeFactory (AF) multiple attributes.

The difference is subtle, because the following code is always safe:
    new CollationFilter(Collator.newInstance(lang))
Cloning there would be wrong.

The reason for the whole thing: TokenStream and Attribute instances themselves are single-threaded
only, but the factory or the analyzer is not.
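
As a minimal sketch of this ownership rule (not the attached patch): the factory clones the
Collator for each object it hands out, while the filter uses whatever Collator it is given.
The class names KeyFilter and KeyFilterFactory are hypothetical stand-ins for the Filter and
the Analyzer, and the JDK's java.text.Collator stands in for the ICU one so the example runs
without extra dependencies. It assumes the ICU Collator can be cloned the same way, via clone().

```java
import java.text.Collator;
import java.util.Locale;

public class CollatorCloningSketch {

    /** Hypothetical stand-in for a single-threaded filter: uses the Collator as given. */
    static final class KeyFilter {
        private final Collator collator;

        KeyFilter(Collator collator) {
            // No clone here: the caller decides who owns the Collator.
            this.collator = collator;
        }

        byte[] keyFor(String term) {
            return collator.getCollationKey(term).toByteArray();
        }

        Collator collator() {
            return collator;
        }
    }

    /** Hypothetical stand-in for the factory (the Analyzer), which may be shared by threads. */
    static final class KeyFilterFactory {
        private final Collator prototype;

        KeyFilterFactory(Collator prototype) {
            this.prototype = prototype;
        }

        KeyFilter newFilter() {
            // Clone once per produced filter so no two filters share Collator state.
            return new KeyFilter((Collator) prototype.clone());
        }
    }

    public static void main(String[] args) {
        Collator shared = Collator.getInstance(Locale.GERMAN);
        KeyFilterFactory factory = new KeyFilterFactory(shared);

        KeyFilter first = factory.newFilter();
        KeyFilter second = factory.newFilter();
        System.out.println("distinct collators: " + (first.collator() != second.collator()));

        // A caller that creates a fresh Collator and passes it straight in needs no clone.
        KeyFilter direct = new KeyFilter(Collator.getInstance(Locale.GERMAN));
        System.out.println("key length: " + direct.keyFor("lucene").length);
    }
}
```

With this split, a filter built directly from a freshly created Collator pays no cloning cost,
and only the shared factory path clones.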

was (Author: thetaphi):

I changed my mind a little bit:

The cloning of the Filter should be done in the Analyzer not in the Filter. The same applies
to the AttributeImpl, the cloning should be done in the ctor. The problem is not that the
TokenStream or the Attribute instance may reuse the attribute in different threads, the problem
is that the factory class (the Analyzer) does reuse the Collator in different threads when
it produces multiple tokenstreams or the AF multiple attributes.

This is a slight difference, because the following code is always safe:
new CollationFilter(Collator.newInstance(lang)), cloning would be wrong.

The reason for the whole thing: TokenStream and Attribute instances themselves are single-threaded
only, but not the factory or the analyzer.
  
> ICU collator thread-safety issues
> ---------------------------------
>
>                 Key: LUCENE-2943
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2943
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Analysis
>            Reporter: Robert Muir
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2943.patch
>
>
> The ICU Collators (unlike the JDK ones) aren't thread-safe: http://userguide.icu-project.org/collation/architecture
> This is a little non-obvious since it's not mentioned in the javadocs, and it's not clear
> whether the docs apply only to the C code, but I looked at the source and there is all
> kinds of internal state.
> So in my opinion, we should clone the ICU collators (which are passed in from the outside)
> when creating a new TokenStream/AttributeImpl to prevent problems. This shouldn't be a big
> deal since everything uses reusableTokenStream anyway.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

