Erik Hatcher wrote:
>> I think this is historical. Most of the other query classes are ones
>> I've implemented, and I've added relevant methods to Similarity for
>> them. WildcardQuery and FuzzyQuery were contributed. I've never used
>> them in an application, as I think they're potential performance
>> pitfalls, so I've probably ignored them when maintaining Similarity.
>
> Yeah, I'm digging deep into these implementations and see the
> performance pitfall too. It's kind of scary, actually. With a wide open
> QueryParser, it could be a potential DoS attack to force a ton of fuzzy
> or wildcard queries through. It's almost as if we should force the
> enabling of those features in QueryParser and have it off by default.
> Just an idea.

Note that BooleanQuery.maxClauseCount was added to keep these sorts of
queries from blowing things up too much. That limit mainly addresses
out-of-memory issues, though. If every query contains a generous
wildcard, or is fuzzy at all, things can still quickly grind to a halt.
A fuzzy query, even if it doesn't match many terms, can be very slow,
consuming large amounts of I/O and CPU.

As a start, we could add parameters to the query parser to disable these
features. If we disable them by default then lots of people will howl:
they're popular features. Perhaps in a future major release they could
be disabled by default, but, for now, I think it would at least be good
to be able to disable them.

Doug
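
To make the "parameters to disable these features" idea concrete, here is a rough sketch of what such a switch might look like if it were applied to the raw query string before it ever reaches the parser. Everything in it is an assumption for illustration: QueryFeatureGuard and rejectReason are hypothetical names, not part of Lucene, and the character scan only approximates the query syntax (it skips backslash-escaped characters and quoted phrases, and treats a ~ directly after a closing quote as phrase slop rather than a fuzzy marker).

```java
// Hypothetical helper, NOT a Lucene class: it scans the raw query text and
// refuses wildcard or fuzzy syntax when those features are switched off.
final class QueryFeatureGuard {
    private final boolean allowWildcard;
    private final boolean allowFuzzy;

    QueryFeatureGuard(boolean allowWildcard, boolean allowFuzzy) {
        this.allowWildcard = allowWildcard;
        this.allowFuzzy = allowFuzzy;
    }

    /** Returns null if the query is acceptable, otherwise a short reason. */
    String rejectReason(String query) {
        boolean inPhrase = false;         // currently inside "..."
        boolean justClosedPhrase = false; // previous character closed a phrase
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            boolean closedPhraseBefore = justClosedPhrase;
            justClosedPhrase = false;
            if (c == '\\') {              // skip the escaped character
                i++;
                continue;
            }
            if (c == '"') {               // toggle phrase state
                justClosedPhrase = inPhrase;
                inPhrase = !inPhrase;
                continue;
            }
            if (inPhrase) {
                continue;                 // ignore everything inside a phrase
            }
            if (!allowWildcard && (c == '*' || c == '?')) {
                return "wildcard queries are disabled";
            }
            // "a b"~3 is phrase slop, so only reject ~ that follows a term.
            if (!allowFuzzy && c == '~' && !closedPhraseBefore) {
                return "fuzzy queries are disabled";
            }
        }
        return null;
    }
}
```

A caller would build the guard from configuration, for example new QueryFeatureGuard(false, false), and only pass the string on to the query parser when rejectReason returns null. Doing the check inside the parser itself would be more accurate, since the parser knows exactly which tokens become wildcard or fuzzy queries; the pre-check above is just the cheapest way to show the on/off switch.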