lucene-dev mailing list archives

From "Uwe Schindler (JIRA)"
Subject [jira] Updated: (LUCENE-1391) Token type and flags values get lost when using ShingleMatrixFilter
Date Fri, 04 Feb 2011 18:18:30 GMT


Uwe Schindler updated LUCENE-1391:

    Attachment: LUCENE-1391.patch

Here is just a funny rewrite of this filter; it is not 100% working (but tests pass). Problems
occur when you define your own matrix and the AttributeSources representing the tokens are not
compatible with copyTo() on the actual TokenStream (e.g. they use a different AttributeFactory
or have additional attributes).

Also, the filter is not yet optimized: currently it always adds all 6 basic attributes.

To get around the TokenType problem, we can add a setter method to explicitly set the type
for shingles (currently it's the class name).
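
A minimal sketch of what such a setter could look like. This is plain Java with no Lucene
dependency; ShingleTypeSketch, setShingleType() and getShingleType() are hypothetical names
and are not taken from the attached patch.

```java
// Hypothetical stand-in for the filter; only the shingle type handling is sketched.
final class ShingleTypeSketch {
    // Default mirrors the current behaviour: the class name is used as the type.
    private String shingleType = ShingleTypeSketch.class.getName();

    /** Explicitly sets the type given to every emitted shingle (hypothetical setter). */
    void setShingleType(String shingleType) {
        if (shingleType == null) {
            throw new IllegalArgumentException("shingleType must not be null");
        }
        this.shingleType = shingleType;
    }

    /** Returns the type that would be written to each shingle's type attribute. */
    String getShingleType() {
        return shingleType;
    }
}
```

A caller that wants a fixed type would call setShingleType("shingle") once after constructing
the filter; without the call, the class-name default stays in effect.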

The FlagsAttribute is itself used by the filter to manage internal token state. It should
be replaced by a filter-internal ShingleMatrixStateAttribute containing an enum.
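
The idea of keeping internal state apart from the user's flags can be sketched as follows.
This is plain Java, not Lucene's Attribute API; MatrixState, its three values and TokenSketch
are illustrative assumptions, not names from the patch.

```java
// Hypothetical filter-internal states; the real set of values is an assumption.
enum MatrixState { NEW_COLUMN, NEW_ROW, SAME_ROW }

// Hypothetical token holder: user flags and internal state live in separate fields.
final class TokenSketch {
    private final String term;
    private final int flags;      // user-supplied flags, never modified by the filter
    private MatrixState state;    // bookkeeping owned by the filter only

    TokenSketch(String term, int flags, MatrixState state) {
        this.term = term;
        this.flags = flags;
        this.state = state;
    }

    String term() { return term; }
    int flags() { return flags; }
    MatrixState state() { return state; }

    /** The filter changes its own state without touching the user's flags. */
    void setState(MatrixState state) { this.state = state; }
}
```

Because the state is no longer encoded in the flags value, whatever flags the caller sets on
a token can pass through unchanged.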

> Token type and flags values get lost when using ShingleMatrixFilter
> -------------------------------------------------------------------
>                 Key: LUCENE-1391
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/analyzers
>    Affects Versions: 2.4, 2.9, 3.0
>            Reporter: Wouter Heijke
>            Assignee: Uwe Schindler
>             Fix For: 3.1, 4.0
>         Attachments: LUCENE-1391.patch
> While using the new ShingleMatrixFilter I noticed that a token's type and flags get lost
> while using this filter. ShingleFilter does respect these values like the other filters I

This message is automatically generated by JIRA.