lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Justin <cry...@yahoo.com>
Subject Re: InverseWildcardQuery
Date Fri, 30 Jul 2010 15:54:04 GMT
> indexing your terms in reverse

Unfortunately the suffix requires a wildcard as well in our case. There are a 
limited number of prefixes though (10ish), so perhaps we could combine them all 
into one query. We'd still need some sort of InverseWildcardQuery 
implementation.

> use another analyzer so you don't need wildcards

I know analyzers can be used with IndexWriter and with QueryParser. Is there 
somewhere an analyzer could be used to alter the field to match the query at 
search time instead of altering the query to match the field?

Our current path to solving our problem requires additional fields which need 
rewritten causing a much larger performance degredation. One of the two paths 
above would be much more desirable.




----- Original Message ----
From: Uwe Schindler <uwe@thetaphi.de>
To: java-user@lucene.apache.org
Sent: Fri, July 30, 2010 10:41:13 AM
Subject: RE: InverseWildcardQuery

With all these requirements you slow down your queries immense. You should
think about indexing your terms different:

- if you need leading wildcards, think about indexing your terms in reverse!
Wildcards starting with * needs to iterate all terms, so it's very slow (and
because of this defaults to be disabled)
- wildcards and regexes should always be used sparingly, as it can happen
that the whole term dictionary needs to be checked. Maybe you should use
another analyzer, so you don't need wildcards.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Ian Lea [mailto:ian.lea@gmail.com]
> Sent: Friday, July 30, 2010 5:33 PM
> To: java-user@lucene.apache.org
> Subject: Re: InverseWildcardQuery
> 
> > I think you're suggesting, for example, "*:* AND -myfield:foo*".
> 
> Yes, I think that is equivalent.
> 
> > If my document contains "myfield:foobar" and "myfield:dog", the
> > document would be thrown out because of the first field. I want to
> > keep the document because the second field does not match.
> 
> OK, too hard for me ...
> 
> > Related, is there a way to use wildcards to match the beginning of the
> field?
> >
> > org.apache.lucene.queryParser.ParseException: Cannot parse '*:* AND
> > -myfield:*foo*': '*' or '?' not allowed as first character in
> > WildcardQuery
> 
> Yes.  QueryParser.setAllowLeadingWildcard().
> 
> 
> --
> Ian.
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


      

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message