lucene-dev mailing list archives

From "Steven Rowe (JIRA)" <>
Subject [jira] Commented: (LUCENE-1380) Patch for ShingleFilter.enablePositions (or PositionFilter)
Date Tue, 23 Sep 2008 14:55:44 GMT


Steven Rowe commented on LUCENE-1380:

A couple of comments on the PositionFilter patch:

# The javadocs should be more explicit, e.g. about the fact that positionIncrement defaults to zero.
# I think there ought to be a constructor that takes a positionIncrement, perhaps instead of the setter.
# You don't handle the case where the filter is reused for more than one document; there should be an else clause that resets firstTokenPositioned to false after this block:
if(null != reusableToken){
    firstTokenPositioned = true;
# You should provide a standalone test for the PositionFilter, in addition to ShingleFilterTest.
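
To make points 1 to 3 concrete, here is a minimal sketch in plain Java. It is not the code from the patch and does not use the Lucene API: SketchToken, PositionFilterSketch, setInput, tokens and drain are hypothetical stand-ins, and the sketch assumes the filter leaves the first token's increment untouched and applies positionIncrement to every later token. It shows a constructor that takes positionIncrement (defaulting to zero) and an else clause that resets firstTokenPositioned at end of stream, so a reused filter treats the next document's first token correctly.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a Lucene Token: a term plus its position increment.
final class SketchToken {
    final String term;
    int positionIncrement = 1;

    SketchToken(String term) {
        this.term = term;
    }
}

// Simplified model of the filter under review (an assumption, not the patch itself).
final class PositionFilterSketch {
    private final int positionIncrement;
    private Iterator<SketchToken> input;
    private boolean firstTokenPositioned = false;

    // Default documented explicitly: positionIncrement is zero unless given.
    PositionFilterSketch(Iterator<SketchToken> input) {
        this(input, 0);
    }

    // Constructor argument in place of a setter.
    PositionFilterSketch(Iterator<SketchToken> input, int positionIncrement) {
        this.input = input;
        this.positionIncrement = positionIncrement;
    }

    // Returns the next token, or null at end of stream.
    SketchToken next() {
        SketchToken token = input.hasNext() ? input.next() : null;
        if (null != token) {
            if (firstTokenPositioned) {
                token.positionIncrement = positionIncrement;
            } else {
                firstTokenPositioned = true;
            }
        } else {
            // End of stream: reset so the next document starts fresh.
            firstTokenPositioned = false;
        }
        return token;
    }

    // Hypothetical hook modelling reuse of the same filter for another document.
    void setInput(Iterator<SketchToken> newInput) {
        this.input = newInput;
    }

    // Test helper: builds a token stream from plain terms.
    static Iterator<SketchToken> tokens(String... terms) {
        List<SketchToken> list = new ArrayList<>();
        for (String term : terms) {
            list.add(new SketchToken(term));
        }
        return list.iterator();
    }

    // Test helper: consumes the stream and collects the resulting increments.
    List<Integer> drain() {
        List<Integer> increments = new ArrayList<>();
        for (SketchToken t = next(); t != null; t = next()) {
            increments.add(t.positionIncrement);
        }
        return increments;
    }
}
```

A standalone test (point 4) could then run two documents through one filter instance and check that each document's first token keeps its own increment while the rest get zero. Without the else clause, the first token of the second document would also be given the zero increment.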

> Patch for ShingleFilter.enablePositions (or PositionFilter)
> -----------------------------------------------------------
>                 Key: LUCENE-1380
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/analyzers
>            Reporter: Mck SembWever
>            Priority: Trivial
>         Attachments: LUCENE-1380-PositionFilter.patch, LUCENE-1380.patch, LUCENE-1380.patch
> Make it possible for *all* words and shingles to be placed at the same position, that is, for _all_ shingles (and unigrams if included) to be treated as synonyms of each other.
> Today the shingles generated are synonyms only of the first term in the shingle.
> For example the query "abcd efgh ijkl" results in:
>    ("abcd" "abcd efgh" "abcd efgh ijkl") ("efgh" "efgh ijkl") ("ijkl")
> where "abcd efgh" and "abcd efgh ijkl" are synonyms of "abcd", and "efgh ijkl" is a synonym of "efgh".
> There exists no way today to alter which token a particular shingle is a synonym for.
> This patch takes the first step in making it possible to make all shingles (and unigrams if included) synonyms of each other.
> See for mailing list thread.
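
The grouping described in the issue can be illustrated by turning position increments into positions. The sketch below is plain Java with a hypothetical helper (ShinglePositionsSketch.positions), not part of Lucene or the patch. It assumes a token's position is the running sum of the increments seen so far, and that a token with increment zero shares the position of the token before it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: maps a sequence of position increments to positions.
final class ShinglePositionsSketch {
    static List<Integer> positions(int[] increments) {
        List<Integer> positions = new ArrayList<>();
        int position = 0;
        for (int increment : increments) {
            // An increment of zero keeps the token at the previous position.
            position += increment;
            positions.add(position);
        }
        return positions;
    }
}
```

For the six tokens "abcd", "abcd efgh", "abcd efgh ijkl", "efgh", "efgh ijkl", "ijkl", the current behaviour corresponds to increments 1, 0, 0, 1, 0, 1, which gives positions 1, 1, 1, 2, 2, 3: three separate synonym groups. If every token after the first has increment zero, the increments are 1, 0, 0, 0, 0, 0 and all six tokens land at position 1, which is the behaviour the issue asks for.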

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
