jackrabbit-dev mailing list archives

From "fabrizio giustina (JIRA)" <j...@apache.org>
Subject [jira] Commented: (JCR-2622) Configured index analizer doesn't really work in 2.1.0?
Date Sat, 01 Jan 2011 17:39:45 GMT

    [ https://issues.apache.org/jira/browse/JCR-2622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976424#action_12976424 ]

fabrizio giustina commented on JCR-2622:
----------------------------------------

I traced back the change that broke index analyzers in Jackrabbit releases after 2.0.x, and it
turned out to be the optimization from JCR-2505 (rev. 915718).

The patch in JCR-2505 added a "reusableTokenStream" method to org.apache.jackrabbit.core.query.lucene.JackrabbitAnalyzer,
which seemed to speed up tests, but it actually breaks them :/
I'm not sure such an optimization is even needed, since the abstract base class org.apache.lucene.analysis.Analyzer
already implements reusableTokenStream as an alias for the default tokenStream method.
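
To illustrate how an extra reusableTokenStream override can bypass a configured analyzer, here is a minimal, Lucene-free sketch. All class names in it (StockAnalyzer, AccentFoldingAnalyzer, DelegatingAnalyzer) are hypothetical stand-ins, token streams are modelled as plain strings, and it assumes that the configured analyzer's superclass ships its own reusable variant that does not route through tokenStream. It is not the code from JCR-2505 or from the attached patch.

```java
import java.util.Locale;

class ReusableStreamSketch {

    // Stand-in for org.apache.lucene.analysis.Analyzer: by default the
    // reusable variant simply falls back to tokenStream.
    static abstract class Analyzer {
        abstract String tokenStream(String text);

        String reusableTokenStream(String text) {
            return tokenStream(text);
        }
    }

    // Hypothetical stock analyzer. ASSUMPTION: it provides its own reusable
    // variant that does not call tokenStream.
    static class StockAnalyzer extends Analyzer {
        @Override
        String tokenStream(String text) {
            return text.toLowerCase(Locale.ROOT);
        }

        @Override
        String reusableTokenStream(String text) {
            return text.toLowerCase(Locale.ROOT);
        }
    }

    // Hypothetical custom analyzer that, like the configured analyzer in the
    // issue, overrides only tokenStream to add accent folding.
    static class AccentFoldingAnalyzer extends StockAnalyzer {
        @Override
        String tokenStream(String text) {
            return super.tokenStream(text).replace('\u00e8', 'e');
        }
    }

    // Hypothetical wrapper around the configured analyzer. The flag switches
    // between forwarding the reusable call and keeping the inherited default.
    static class DelegatingAnalyzer extends Analyzer {
        private final Analyzer delegate;
        private final boolean forwardReusable;

        DelegatingAnalyzer(Analyzer delegate, boolean forwardReusable) {
            this.delegate = delegate;
            this.forwardReusable = forwardReusable;
        }

        @Override
        String tokenStream(String text) {
            return delegate.tokenStream(text);
        }

        @Override
        String reusableTokenStream(String text) {
            return forwardReusable
                    ? delegate.reusableTokenStream(text) // skips the custom override
                    : super.reusableTokenStream(text);   // ends up in delegate.tokenStream
        }
    }

    // Runs the wrapper's reusable path over the given text.
    static String analyze(boolean forwardReusable, String text) {
        return new DelegatingAnalyzer(new AccentFoldingAnalyzer(), forwardReusable)
                .reusableTokenStream(text);
    }

    public static void main(String[] args) {
        System.out.println(analyze(true, "T\u00e8st"));  // accent is kept
        System.out.println(analyze(false, "T\u00e8st")); // accent is folded to "e"
    }
}
```

In this model, forwarding the reusable call lands in the stock implementation and the accent filter never runs, while the inherited default reaches the custom tokenStream override.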

The attached patch fixes the bug in the trunk release by simply removing the "optimization".
The patch also contains a testcase that shows the issue.

Could somebody please commit the patch to trunk and to the 2.1/2.2 branches?


> Configured index analizer doesn't really work in 2.1.0?
> -------------------------------------------------------
>
>                 Key: JCR-2622
>                 URL: https://issues.apache.org/jira/browse/JCR-2622
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>    Affects Versions: 2.1.0, 2.1.1, 2.1.2, 2.1.3, 2.2.0
>            Reporter: fabrizio giustina
>            Priority: Critical
>         Attachments: JCR-2622-tests_and_patch.diff
>
>
> I just tried migrating an existing project which was using jackrabbit 2.0.0 to 2.1.0.
> We have an index analyzer configured which filters accented chars: 
> {code}
> public class ItalianSnowballAnalyzer extends StandardAnalyzer
> {
>     @Override
>     public TokenStream tokenStream(String fieldName, Reader reader)
>     {
>         return new ISOLatin1AccentFilter(new LowerCaseFilter(super.tokenStream(fieldName, reader)));
>     }
> }
> {code}
> The project has a good number of unit tests: an XML file is loaded into a memory-only Jackrabbit
> repository and several queries are checked against expected results.
> After migrating to 2.1.0, none of the tests that relied on the index analyzer work anymore;
> for example, searching for "test" no longer finds nodes containing "tèst".
> Upgrading to Jackrabbit 2.1.0 is the only change made (no changes to the configuration, code,
> or other libraries at all). Rolling back to the 2.0.0 dependency is enough to make all the
> tests work again.
> I've checked the changes in 2.1 but couldn't find any apparently related change. Also
> note that I was already using the patch from JCR-2504 before (configuration loading works
> fine in the unpatched 2.1). Another point is that the configured IndexAnalyzer still gets
> called during our tests (checked in debug mode).
> Any idea?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

