lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Paul Smith" <>
Subject Re: Hungarian notation analyzer and phrase queries
Date Thu, 14 Apr 2005 06:45:52 GMT
Thanks for your help guys!

If you put the term query at position 2 then you need slop to find "Use
PowerQuery for advanced searches", which is the exact text in the
document.  I think I'd rather have that phrase query work without any
slop, and require some slop for  "use power query for advanced
searches", because it doesn't really match the document exactly anyway.
Putting powerquery, power, and query at the same position has this
behavior. But in this approach, queries like  "Use query for advanced
searches" also match.

On the other hand, if we put query at the next position (position 2 in
this example), then "Use query for advanced searches" does not match

Either of these approaches seem to work fine with non-phrase queries.
The approach that always replaces power query with powerquery has
problems in some of these areas (like wildcard and fuzzy searches).

So it sounds like there isn't a perfect solution, but I think the best
tradeoff for me is to put them all in the same position.... unless
anyone has more input on the subject?


>>> 04/13/05 11:36AM >>>

: Another approach would be to index this as:
: token:       use   power      query for advanced searches
:                     powerquery
: position:    0     1          2     3   4        5
: Then use phrase queries with slop=1, to permit a one-token gap when
: someone searches for "use powerquery for advanced searches".

right, but in your example "1" is a magic number that works because we
only dealing with "multi-word synonyms" of 2 words.  in general, this
approach requires that you pick some some N such that you are garunteed
synonym contains more then N-1 words, and set the token positions

 token:       use   powerquery         for  advanced  searches
                    power      query
 position:    0     N          N+1     2N   3N        4N


To unsubscribe, e-mail: 
For additional commands, e-mail:

This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
the TenFold Postmaster (

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message