lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (LUCENE-3042) AttributeSource can have an invalid computed state
Date Fri, 22 Apr 2011 21:45:06 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3042:
----------------------------------

          Component/s: Analysis
             Priority: Critical  (was: Major)
    Affects Version/s: 4.0
                       3.2
                       2.9.4
                       3.0.3
                       3.1
        Fix Version/s: 4.0
                       3.2

Just to conclude:
This bug is not as serious as it appears (otherwise someone would have noticed it before), because
it never happens with run-of-the-mill TokenStreams when they are used the way IndexWriter uses them.
It only appears if you have TokenFilters and add Attributes on the top-level filter
later (after the TokenStream has been used for the first time). Using the TokenStream means that
the states get computed, so every Filter/Tokenizer has its own cached state. Adding a new Attribute
on the last filter will never invalidate the cache of the Tokenizer.
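
To illustrate the mechanism described above, here is a minimal, self-contained sketch. It is
a simplified model and not Lucene's actual code: the class names (StaleStateSketch, Source,
Attr) and the String-keyed attribute map are hypothetical. It only assumes what is stated above,
namely that the attributes are shared along the chain while each Filter/Tokenizer caches its
own computed state.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StaleStateSketch {

    /** Hypothetical stand-in for an attribute: a value that clear() resets. */
    static final class Attr {
        String value = "";
        void clear() { value = ""; }
    }

    /** Simplified model: the attribute map is shared along the chain, the cached state is not. */
    static class Source {
        final Map<String, Attr> attributes;  // shared with the input source
        private List<Attr> cachedState;      // per instance: the flaw being modeled

        Source() { this.attributes = new LinkedHashMap<>(); }
        Source(Source input) { this.attributes = input.attributes; }

        Attr addAttribute(String name) {
            Attr attr = attributes.get(name);
            if (attr == null) {
                attr = new Attr();
                attributes.put(name, attr);
                cachedState = null;          // invalidates only this instance's cache
            }
            return attr;
        }

        void clearAttributes() {
            if (cachedState == null) {
                // Compute the state once and cache it for later calls.
                cachedState = new ArrayList<>(attributes.values());
            }
            for (Attr attr : cachedState) {
                attr.clear();
            }
        }
    }

    /** Returns the value of the late-added attribute after the tokenizer clears its attributes. */
    static String run() {
        Source tokenizer = new Source();
        Source filter = new Source(tokenizer);

        Attr term = tokenizer.addAttribute("term");
        term.value = "foo";
        tokenizer.clearAttributes();         // first use: the tokenizer caches its state

        Attr late = filter.addAttribute("late");  // added on the top-level filter afterwards
        late.value = "stale";
        tokenizer.clearAttributes();         // uses the stale cache, so "late" is not cleared
        return late.value;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this model the second clearAttributes() call on the tokenizer leaves the late-added
attribute untouched, because only the filter's cache was invalidated.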

This bug could affect:
- Analyzers that partly reuse TokenStreams and plug filters on top in the reusableTokenStream()
method, reusing the partially cached TokenStream, such as those that always add a non-cacheable
TokenFilter on top of a base TokenStream.
- TokenStreams that add attributes on the fly in one of their filters.

We should backport this patch to the 3.x and 3.1.1 branches, and maybe even to the 2.9.x and 3.0.x
branches (if somebody wants to patch 3.0). In general, this is a serious issue that has been in the
new TokenStream API since 2.9.


> AttributeSource can have an invalid computed state
> --------------------------------------------------
>
>                 Key: LUCENE-3042
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3042
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Analysis
>    Affects Versions: 2.9.4, 3.0.3, 3.1, 3.2, 4.0
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>            Priority: Critical
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-3042.patch, LUCENE-3042.patch
>
>
> If you work with a tokenstream, consume it, then reuse it and add an attribute to it, the computed state is wrong.
> Thus, for example, clearAttributes() will not actually clear the attribute added.
> So in some situations, addAttribute is not actually clearing the computed state when it should.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

