lucene-dev mailing list archives

From "Steven Rowe (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1380) Patch for ShingleFilter.enablePositions (or PositionFilter)
Date Tue, 23 Sep 2008 14:55:44 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633756#action_12633756 ]

Steven Rowe commented on LUCENE-1380:
-------------------------------------

A couple of comments on the PositionFilter patch:

# The javadocs should be more explicit, e.g. about the fact that positionIncrement defaults to zero.
# I think there ought to be a constructor that takes in a positionIncrement, perhaps instead of the setter.
# You don't handle the case where the filter is used for more than one document; there should be an else clause that resets firstTokenPositioned to false after this block:
{code:java}
if(null != reusableToken){
  if(firstTokenPositioned){
    reusableToken.setPositionIncrement(positionIncrement);
  }else{
    firstTokenPositioned = true;
  }
}
{code}
# You should provide a standalone test for the PositionFilter, in addition to the ShingleFilterTest tests.
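
To illustrate points 2 and 3, here is a minimal sketch of the reset behaviour. It deliberately does not use Lucene's classes: Token, PositionFilterSketch, setInput and increments are hypothetical stand-ins invented for this example, and only positionIncrement and firstTokenPositioned come from the patch under discussion. It assumes end of stream is signalled by a null token, as in the quoted block.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Sketch only: a simplified model of the filter, not the patch itself.
class PositionFilterSketch {

    /** Hypothetical stand-in for a Lucene token; only the increment matters here. */
    static final class Token {
        final String term;
        int positionIncrement = 1; // default increment of an unfiltered token

        Token(String term) {
            this.term = term;
        }
    }

    private final int positionIncrement;
    private boolean firstTokenPositioned = false;
    private Iterator<Token> input;

    /** Point 2: the increment is supplied through the constructor. */
    PositionFilterSketch(int positionIncrement) {
        this.positionIncrement = positionIncrement;
    }

    /** Hypothetical helper: start filtering the tokens of a new document. */
    void setInput(List<Token> tokens) {
        this.input = tokens.iterator();
    }

    /** Returns the next token, or null at end of stream. */
    Token next() {
        Token token = input.hasNext() ? input.next() : null;
        if (null != token) {
            if (firstTokenPositioned) {
                token.positionIncrement = positionIncrement;
            } else {
                firstTokenPositioned = true; // first token keeps its own increment
            }
        } else {
            // Point 3: end of stream, so reset for the next document.
            firstTokenPositioned = false;
        }
        return token;
    }

    /** Hypothetical helper: run one document through the filter and collect increments. */
    static List<Integer> increments(PositionFilterSketch filter, String... terms) {
        List<Token> tokens = new ArrayList<>();
        for (String term : terms) {
            tokens.add(new Token(term));
        }
        filter.setInput(tokens);
        List<Integer> result = new ArrayList<>();
        for (Token t = filter.next(); t != null; t = filter.next()) {
            result.add(t.positionIncrement);
        }
        return result;
    }
}
```

Without the else branch, the first token of a second document run through the same filter instance would be given positionIncrement instead of keeping its own increment. A standalone test along the lines of point 4 could check exactly that by running two documents through one instance.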

> Patch for ShingleFilter.enablePositions (or PositionFilter)
> -----------------------------------------------------------
>
>                 Key: LUCENE-1380
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1380
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/analyzers
>            Reporter: Mck SembWever
>            Priority: Trivial
>         Attachments: LUCENE-1380-PositionFilter.patch, LUCENE-1380.patch, LUCENE-1380.patch
>
>
> Make it possible for *all* words and shingles to be placed at the same position, that is for _all_ shingles (and unigrams if included) to be treated as synonyms of each other.
> Today the shingles generated are synonyms only to the first term in the shingle.
> For example the query "abcd efgh ijkl" results in:
>    ("abcd" "abcd efgh" "abcd efgh ijkl") ("efgh" "efgh ijkl") ("ijkl")
> where "abcd efgh" and "abcd efgh ijkl" are synonyms of "abcd", and "efgh ijkl" is a synonym of "efgh".
> There exists no way today to alter which token a particular shingle is a synonym for.
> This patch takes the first step in making it possible to make all shingles (and unigrams if included) synonyms of each other.
> See http://comments.gmane.org/gmane.comp.jakarta.lucene.user/34746 for mailing list thread.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

