lucene-dev mailing list archives

From "Mck SembWever (JIRA)" <j...@apache.org>
Subject [jira] Updated: (LUCENE-1380) Patch for ShingleFilter.enablePositions
Date Tue, 23 Sep 2008 14:35:44 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mck SembWever updated LUCENE-1380:
----------------------------------

    Attachment: LUCENE-1380-PositionFilter.patch

> If you really want to make this change in this layer, I suggest that you separate this
> feature out into a new filter that modifies the position increment.

Attaching an alternative patch, as suggested, for PositionFilter and its test.
The first token always keeps its original positionIncrement, but subsequent tokens in
the TokenStream have their positionIncrement set to the value of PositionFilter.positionIncrement.
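
To make the intended behaviour concrete, here is a minimal sketch in plain Java. It is not the code in the attached patch: SketchToken and PositionFilterSketch are hypothetical stand-ins that model only a term and its position increment, without any Lucene classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Lucene token: only a term and a position increment.
final class SketchToken {
    final String term;
    final int positionIncrement;

    SketchToken(String term, int positionIncrement) {
        this.term = term;
        this.positionIncrement = positionIncrement;
    }
}

// Illustrative model of the described behaviour, not the patch itself.
final class PositionFilterSketch {
    static List<SketchToken> apply(List<SketchToken> input, int positionIncrement) {
        List<SketchToken> output = new ArrayList<>();
        boolean first = true;
        for (SketchToken token : input) {
            // The first token keeps its own increment; every later token
            // gets the configured increment instead.
            int increment = first ? token.positionIncrement : positionIncrement;
            output.add(new SketchToken(token.term, increment));
            first = false;
        }
        return output;
    }
}
```

With a configured increment of 0, every token after the first lands on the same position as the first one, which is the "all synonyms of each other" case this issue is after.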

I still fail to understand why Karl and Steve would rather see this logic in the QueryParser.
The best explanation so far was from Steve:
> IMO, the correct layer to solve this is in Solr's QParser -
> I think there should be a way to tell the parser not to parse, but rather to send the
> whole query to be analyzed.

But I wouldn't be surprised if this goes against the grain of how Solr works.


> Patch for ShingleFilter.enablePositions
> ---------------------------------------
>
>                 Key: LUCENE-1380
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1380
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/analyzers
>            Reporter: Mck SembWever
>            Priority: Trivial
>         Attachments: LUCENE-1380-PositionFilter.patch, LUCENE-1380.patch, LUCENE-1380.patch
>
>
> Make it possible for *all* words and shingles to be placed at the same position, that
> is, for _all_ shingles (and unigrams if included) to be treated as synonyms of each other.
> Today the shingles generated are synonyms only of the first term in the shingle.
> For example the query "abcd efgh ijkl" results in:
>    ("abcd" "abcd efgh" "abcd efgh ijkl") ("efgh" "efgh ijkl") ("ijkl")
> where "abcd efgh" and "abcd efgh ijkl" are synonyms of "abcd", and "efgh ijkl" is a synonym
> of "efgh".
> There is no way today to alter which token a particular shingle is a synonym of.
> This patch takes the first step towards making all shingles (and unigrams
> if included) synonyms of each other.
> See http://comments.gmane.org/gmane.comp.jakarta.lucene.user/34746 for mailing list thread.
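
The grouping in the issue description can be reproduced with a small plain-Java sketch. It does not use ShingleFilter; ShinglePositionSketch, the zero-based position numbering, and the "position:shingle" output format are assumptions made for illustration, and the maximum shingle size of 3 is inferred from the example query.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: models which position each shingle would occupy.
final class ShinglePositionSketch {
    // Returns entries of the form "position:shingle".
    static List<String> shingles(String[] words, int maxShingleSize, boolean samePosition) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < words.length; start++) {
            StringBuilder shingle = new StringBuilder();
            for (int size = 1; size <= maxShingleSize && start + size <= words.length; size++) {
                if (size > 1) {
                    shingle.append(' ');
                }
                shingle.append(words[start + size - 1]);
                // Current behaviour: a shingle shares the position of its first word.
                // Requested behaviour: every shingle shares one position.
                int position = samePosition ? 0 : start;
                out.add(position + ":" + shingle);
            }
        }
        return out;
    }
}
```

Calling shingles(new String[] {"abcd", "efgh", "ijkl"}, 3, false) puts "abcd", "abcd efgh" and "abcd efgh ijkl" at position 0, "efgh" and "efgh ijkl" at position 1, and "ijkl" at position 2, matching the grouping quoted above. Passing true places all six entries at position 0.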

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

