lucene-java-user mailing list archives

From Justin <>
Subject Re: InverseWildcardQuery
Date Fri, 30 Jul 2010 18:05:55 GMT
> an example

PerFieldAnalyzerWrapper analyzers =
    new PerFieldAnalyzerWrapper(new KeywordAnalyzer());
// myfield defaults to KeywordAnalyzer
analyzers.addAnalyzer("content", new SnowballAnalyzer(luceneVersion, 
// analyzers affects the indexed field value
IndexWriter writer = new IndexWriter(dir, analyzers, true, mfl);
// analyzers affects the parsed query string
QueryParser parser = new QueryParser(luceneVersion, "myfield", analyzers);
Query query = parser.parse("*:* AND -myfield:\"*foo*\"");
// What about an Analyzer to match field value to the query at search time?
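// "searcher" is an assumed IndexSearcher over dir; its name was lost from the archived message
ScoreDoc[] docs = searcher.search(query, null, 1000).scoreDocs;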

> An inverse query would require rewriting, too, I think.

Why would implementing a new Query class require document changes in the index?

> Can you turn those prefixes into field names

No, the prefixes are not discrete. Multiple field values could start with the 
same prefix.

Writing something like InverseWildcardQuery seems like the most appropriate 
solution. My idea of applying another Analyzer to the field value at search 
time may be crazy; I don't know.
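
To pin down the semantics such a query would need, here is a minimal plain-Java sketch that uses no Lucene classes. It assumes the goal is what the query string above expresses: every document except those where some value of the field matches the wildcard. The class and method names (InverseWildcardSketch, toPattern, docsWithoutMatch) are hypothetical, and the brute-force scan over an in-memory map only illustrates the expected result set, not how a real Query implementation would walk the term index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Lucene.
class InverseWildcardSketch {

    // Translate a wildcard expression ('*' = any run of characters,
    // '?' = exactly one character) into a regular expression.
    static Pattern toPattern(String wildcard) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*' || c == '?') {
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(c == '*' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // Return the ids of documents in which NO field value matches the
    // wildcard. Documents with no values at all are included, which
    // mirrors "*:* AND -myfield:<wildcard>".
    static List<Integer> docsWithoutMatch(Map<Integer, List<String>> docs, String wildcard) {
        Pattern pattern = toPattern(wildcard);
        List<Integer> result = new ArrayList<>();
        for (Map.Entry<Integer, List<String>> entry : docs.entrySet()) {
            boolean anyMatch = false;
            for (String value : entry.getValue()) {
                if (pattern.matcher(value).matches()) {
                    anyMatch = true;
                    break;
                }
            }
            if (!anyMatch) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

If the intended semantics are instead "documents that have at least one value that does not match", the inner loop would need to be inverted; the sketch above does not cover that reading.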

----- Original Message ----
From: Steven A Rowe <>
Sent: Fri, July 30, 2010 12:04:58 PM
Subject: RE: InverseWildcardQuery

Hi Justin,

> Unfortunately the suffix requires a wildcard as well in our case. There
> are a limited number of prefixes though (10ish), so perhaps we could
> combine them all into one query. We'd still need some sort of
> InverseWildcardQuery implementation.
> > use another analyzer so you don't need wildcards
> I know analyzers can be used with IndexWriter and with QueryParser. Is
> there somewhere an analyzer could be used to alter the field to match the
> query at search time instead of altering the query to match the field?

Can you give an example of what you mean?

> Our current path to solving our problem requires additional fields which
> need to be rewritten, causing a much larger performance degradation. One of
> the two paths above would be much more desirable.

An inverse query would require rewriting, too, I think.

You say you have 10-ish prefixes.  Can you turn those prefixes into field names, 
and index a token like EMPTY when there are no values for a particular prefix?  
Then your query would be (F1:EMPTY OR F2:EMPTY ... OR F10:EMPTY).
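
As a rough illustration of that suggestion, the plain-Java sketch below (no Lucene classes) works out what would be indexed in each per-prefix field and builds the matching query string. The field names F1..Fn and the EMPTY token come from the paragraph above; the class and method names (EmptySentinelSketch, fieldsFor, emptyQuery) are hypothetical, and it assumes each prefix is a plain leading string of the field value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Lucene.
class EmptySentinelSketch {
    static final String EMPTY = "EMPTY";

    // For each known prefix (F1 for the first, F2 for the second, ...),
    // collect the values that start with it, or the EMPTY sentinel when
    // the document has no value for that prefix.
    static Map<String, List<String>> fieldsFor(List<String> prefixes, List<String> values) {
        Map<String, List<String>> fields = new LinkedHashMap<>();
        for (int i = 0; i < prefixes.size(); i++) {
            List<String> matches = new ArrayList<>();
            for (String value : values) {
                if (value.startsWith(prefixes.get(i))) {
                    matches.add(value);
                }
            }
            if (matches.isEmpty()) {
                matches.add(EMPTY);
            }
            fields.put("F" + (i + 1), matches);
        }
        return fields;
    }

    // Build "(F1:EMPTY OR F2:EMPTY OR ... OR Fn:EMPTY)".
    static String emptyQuery(int prefixCount) {
        StringBuilder query = new StringBuilder("(");
        for (int i = 1; i <= prefixCount; i++) {
            if (i > 1) {
                query.append(" OR ");
            }
            query.append("F").append(i).append(":").append(EMPTY);
        }
        return query.append(")").toString();
    }
}
```

The sentinel turns "no value with this prefix" into an ordinary term lookup, at the cost of one extra field per prefix and a reindex whenever the prefix list changes.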


