lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Tatu Saloranta <t...@hypermall.net>
Subject Re: Wildcard workaround
Date Fri, 30 May 2003 00:57:57 GMT
On Wednesday 28 May 2003 05:43, David Medinets wrote:
> ----- Original Message -----
> From: "Andrei Melis" <Andrei.Melis@snt.ro>
>
> > As far as I have understood, lucene does not allow search queries
> > starting with wildcards. I have a file database indexed by content
> > and also by filename. It would be nice if the user could perform a
> > usual search like "*.ext".
>
> Does anyone know if Oracle patented the technique that they use for *ext
> searching in the Oracle Text product. If not, I'm sure the technique can be
> borrowed.
>
> On the other hand, the slow technique of comparing each term to *.ext can
> certainly be implemented with a minimum of effort, I think.

[apologies if somebody else already pointed this out... I missed some mails to 
the list from yesterday]

One of the most interesting solutions somebody posted earlier, was to use
2 indexes; one for 'normal' searches, with normal analyzer etc, and second
one that uses reversed words; ie. analyzer reverses words tokenized by
standard analyzer. This second index would then allow for searches
to do prefix match, in this case query would be something like

reverse_field:txe.*

This would work efficiently, although pretty much double the size of index for 
content that has to be prefix-searchable. Still, this solution somehow 
appeals to my hacker side. :-)

In this specific case, though, what others have suggested (add file prefix as 
separate field to search on), is probably more practical.

-+ Tatu +-


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message