lucene-java-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: NGramPhraseQuery with missing terms
Date Wed, 19 Dec 2012 14:05:22 GMT
"a BooleanQuery, but it requires me to consider every possible pair of terms 
(since any one of the terms could be missing)"

What about adding all the terms as "SHOULD" clauses and setting minMatch? 
minMatch could then be tuned for how many missing terms to tolerate.

See:
http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/BooleanQuery.html#setMinimumNumberShouldMatch(int)
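To make the suggestion concrete, here is a minimal plain-Java sketch of the "all terms optional, require at least minMatch of them" rule. It does not use Lucene: the class NGramMinMatch and its methods are hypothetical names invented for illustration, and the n-grams are plain character n-grams. In Lucene the same rule would be expressed by adding each n-gram term as a SHOULD clause and calling setMinimumNumberShouldMatch(int) on the BooleanQuery. Like a BooleanQuery, the sketch ignores term positions, so it gives up the phrase constraint.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene: models "all SHOULD + minimum match".
class NGramMinMatch {

    // Split text into overlapping character n-grams,
    // e.g. "lucene" with n = 2 gives lu, uc, ce, en, ne.
    static List<String> ngrams(String text, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= text.length(); i++) {
            grams.add(text.substring(i, i + n));
        }
        return grams;
    }

    // Count how many query n-grams occur anywhere in the document (positions ignored).
    static int matchedCount(String query, String doc, int n) {
        Set<String> docGrams = new HashSet<>(ngrams(doc, n));
        int matched = 0;
        for (String gram : ngrams(query, n)) {
            if (docGrams.contains(gram)) {
                matched++;
            }
        }
        return matched;
    }

    // Accept the document when at most maxMissing query n-grams are absent.
    static boolean matches(String query, String doc, int n, int maxMissing) {
        int total = ngrams(query, n).size();
        int minMatch = Math.max(0, total - maxMissing); // analogue of minMatch
        return matchedCount(query, doc, n) >= minMatch;
    }
}
```

For example, NGramMinMatch.matches("lucene", "lucane", 2, 2) accepts the document because three of the five query bigrams are present, while lowering maxMissing to 1 rejects it.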

-- Jack Krupansky

-----Original Message----- 
From: 김한규
Sent: Wednesday, December 19, 2012 2:36 AM
To: java-user@lucene.apache.org
Subject: NGramPhraseQuery with missing terms

Hi.

I am trying to make an NGramPhrase query that can tolerate missing terms,
so that even if one of the NGrams doesn't match, the document still gets
picked up by the search.
I know I could use a combination of a normal SpanNearQuery and a
BooleanQuery, but it requires me to consider every possible pair of terms
(since any one of the terms could be missing) and it gets too messy and
expensive.

What I want to try is to use SpanTermQuery to get the positions of the
matching NGrams and list the spans' position information in order, so
that I could pick up any two or more spans near each other and score them
accordingly, but I can't figure out how to combine the spans.
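The combining step described above can be sketched in plain Java, independent of Lucene. The sketch assumes the positions of each query NGram have already been collected into one int array per NGram; how they are read from the spans is left out. The names SpanClusters, Hit, maxGap and minTerms are hypothetical and chosen only for illustration. All hits are merged into one list ordered by position, consecutive hits no more than maxGap apart are grouped, and only groups containing at least minTerms distinct NGrams are kept.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene: groups nearby n-gram positions.
class SpanClusters {

    // One occurrence of a query n-gram: its index in the query and its document position.
    static final class Hit {
        final int term;
        final int position;

        Hit(int term, int position) {
            this.term = term;
            this.position = position;
        }
    }

    // positionsPerTerm[i] holds the document positions of the i-th query n-gram
    // (an empty array means that n-gram is missing from the document).
    static List<List<Hit>> clusters(int[][] positionsPerTerm, int maxGap, int minTerms) {
        // Merge all positions into one list ordered by position.
        List<Hit> hits = new ArrayList<>();
        for (int t = 0; t < positionsPerTerm.length; t++) {
            for (int p : positionsPerTerm[t]) {
                hits.add(new Hit(t, p));
            }
        }
        hits.sort(Comparator.comparingInt((Hit h) -> h.position));

        // Start a new group whenever the gap to the previous hit exceeds maxGap.
        List<List<Hit>> result = new ArrayList<>();
        List<Hit> current = new ArrayList<>();
        for (Hit h : hits) {
            if (!current.isEmpty()
                    && h.position - current.get(current.size() - 1).position > maxGap) {
                keepIfEnough(current, minTerms, result);
                current = new ArrayList<>();
            }
            current.add(h);
        }
        keepIfEnough(current, minTerms, result);
        return result;
    }

    // Keep a group only if it is non-empty and covers at least minTerms distinct n-grams.
    static void keepIfEnough(List<Hit> cluster, int minTerms, List<List<Hit>> out) {
        Set<Integer> distinct = new HashSet<>();
        for (Hit h : cluster) {
            distinct.add(h.term);
        }
        if (!cluster.isEmpty() && distinct.size() >= minTerms) {
            out.add(cluster);
        }
    }
}
```

A score could then be derived per group, for instance from the number of distinct NGrams it contains, but that choice is outside the scope of this sketch.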

Any help in solving this issue is appreciated. Also, if there is a simple
example of a scoring implementation that combines multiple queries'
results, it would be very nice.

