lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] Commented: (LUCENE-2943) ICU collator thread-safety issues
Date Mon, 28 Feb 2011 21:44:36 GMT


Robert Muir commented on LUCENE-2943:

Uwe, I agree it looks a little wrong, but it makes the 'reusable' case easier.

The example you gave is the slow non-reusable case... honestly, I'm not very worried about
making this slower; it's already slow.

If we put the responsibility on the user to pass Collator clones to each TokenFilter, it
will make reuse more difficult (e.g. for custom analyzers).

Again, the big trap is that usually you see "WARNING THIS CLASS IS NOT THREAD SAFE", but the
ICU javadoc doesn't really say that; instead you have to read this general architecture document...
So I think that by pushing the responsibility to the user, we would get lots of bugs (if anyone
writes a custom analyzer/factory/etc.).
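
As a rough illustration of the approach argued for here, the sketch below has the filter clone whatever collator it is handed, so code that builds custom analyzers can keep passing one shared instance around. It is not the attached patch: `CloningKeyFilter` and its methods are hypothetical names, and `java.text.Collator` stands in for `com.ibm.icu.text.Collator` only so the example runs without ICU4J on the classpath (the ICU class also exposes `clone()`, though there it declares `CloneNotSupportedException`).

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

/**
 * Hypothetical stand-in for a collation key filter; NOT Lucene's actual class.
 * Assumption: java.text.Collator is used in place of the ICU collator so the
 * sketch compiles with the JDK alone.
 */
final class CloningKeyFilter {
    private final Collator collator;

    CloningKeyFilter(Collator shared) {
        // Defensive copy: the filter owns a private collator, so the caller
        // does not have to remember to clone before passing it in.
        this.collator = (Collator) shared.clone();
    }

    /** Returns the binary collation key for a term, using the private clone. */
    byte[] keyFor(String term) {
        return collator.getCollationKey(term).toByteArray();
    }

    /** True only if this filter holds the very same instance as 'other'. */
    boolean sharesCollatorWith(Collator other) {
        return collator == other;
    }
}

final class CollatorCloneSketch {
    public static void main(String[] args) {
        // One collator created by the user and handed to several filters.
        Collator shared = Collator.getInstance(Locale.ENGLISH);
        CloningKeyFilter a = new CloningKeyFilter(shared);
        CloningKeyFilter b = new CloningKeyFilter(shared);

        // Each filter works on its own copy, yet the keys stay consistent.
        System.out.println("a shares instance: " + a.sharesCollatorWith(shared));
        System.out.println("b shares instance: " + b.sharesCollatorWith(shared));
        System.out.println("same keys: "
                + Arrays.equals(a.keyFor("lucene"), b.keyFor("lucene")));
    }
}
```

The cost is one clone per filter instance, which only matters on the non-reusable path, since a reused filter is constructed once and so clones once.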

> ICU collator thread-safety issues
> ---------------------------------
>                 Key: LUCENE-2943
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Analysis
>            Reporter: Robert Muir
>             Fix For: 3.1, 4.0
>         Attachments: LUCENE-2943.patch
> The ICU Collators (unlike the JDK ones) aren't thread safe. This is a little non-obvious since it's not mentioned
> in the javadocs, and it's not clear if the docs apply only to the C code, but I looked
> at the source and there is all kinds of internal state.
> So in my opinion, we should clone the ICU collators (which are passed in from the outside)
> when creating a new TokenStream/AttributeImpl to prevent problems. This shouldn't be a big
> deal since everything uses reusableTokenStream anyway.

