lucene-dev mailing list archives

From markharw00d <>
Subject Re: regex-based query contribution
Date Thu, 13 Oct 2005 18:15:07 GMT
Sounds like a very useful addition, but since this is yet another variant 
of the "term expanding" queries (fuzzy/prefix/range/wildcard), now might 
be a good time to re-raise the scoring issue I originally identified here 
with all such queries:

The issue is that "automagically" expanded terms are rewritten to a 
standard boolean query and, because of the default IDF factor behaviour, 
rarer (often misspelt) terms are favoured over more common ones.
I don't imagine this is desirable behaviour for anyone.
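
To make the skew concrete, here is a minimal sketch in plain Java (no Lucene dependency). It assumes the classic DefaultSimilarity-style formula idf = 1 + ln(numDocs / (docFreq + 1)); the class name, index size and document frequencies are all made up for illustration.

```java
public class IdfSkewDemo {
    // Classic TF-IDF style inverse document frequency (assumed formula):
    // 1 + ln(numDocs / (docFreq + 1)). Fewer matching docs means a larger value.
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log(numDocs / (double) (docFreq + 1));
    }

    public static void main(String[] args) {
        int numDocs = 1_000_000; // hypothetical index size
        int commonDf = 50_000;   // hypothetical docFreq of the correctly spelt term
        int rareDf = 3;          // hypothetical docFreq of a misspelt variant

        // The misspelt variant gets the larger IDF, so documents containing
        // it are pushed above documents containing the common spelling.
        System.out.printf("common term idf = %.3f%n", idf(commonDf, numDocs));
        System.out.printf("rare term idf   = %.3f%n", idf(rareDf, numDocs));
    }
}
```

As far as I recall, the classic scoring formula applies the IDF factor twice (once in the query weight and once in the document-side weight), so the gap in the final scores is wider than the raw IDF values suggest.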

I did provide an implementation that addressed this by ensuring all 
generated terms in the boolean query used the same IDF. The search 
results I posted showed a clear improvement. Unfortunately this was not 
rolled into core; however, the other auto-expanding issue I raised on 
this JIRA bug, to do with coords, was addressed by adding disableCoord to 
Since then a lot of work has gone into BooleanQuery scoring, not 
all of it committed, so I'm not sure how best to address this concern or 
what code to extend/modify.
Anyone (Paul?) have any suggestions?
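
For anyone unfamiliar with the idea of giving all generated terms the same IDF, a rough sketch in plain Java follows. It is not the implementation referred to above. The choice of shared value (the IDF of the most frequent expanded term), the class and method names, and the sample frequencies are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class UniformIdfSketch {
    // Assumed classic formula: 1 + ln(numDocs / (docFreq + 1)).
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log(numDocs / (double) (docFreq + 1));
    }

    // Give every expanded term one shared IDF so that rare variants get no
    // boost. Assumption: the shared value is the IDF of the most frequent
    // variant; an average or some other choice would work equally well here.
    static Map<String, Double> uniformIdf(Map<String, Integer> docFreqs, int numDocs) {
        int maxDf = 0;
        for (int df : docFreqs.values()) {
            maxDf = Math.max(maxDf, df);
        }
        double shared = idf(maxDf, numDocs);
        Map<String, Double> result = new LinkedHashMap<>();
        for (String term : docFreqs.keySet()) {
            result.put(term, shared);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical expansion of a fuzzy or regex query.
        Map<String, Integer> docFreqs = new LinkedHashMap<>();
        docFreqs.put("lucene", 50_000);
        docFreqs.put("lucine", 3);
        docFreqs.put("lucen", 12);
        System.out.println(uniformIdf(docFreqs, 1_000_000));
    }
}
```

With a shared IDF, ranking among the expanded terms comes down to term frequency and length normalisation rather than to how rare a particular variant happens to be.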

