lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Tom.Hic...@USPTO.GOV
Subject RE: Wildcard workaround
Date Wed, 04 Jun 2003 17:04:15 GMT
I think BRS was probably using the reversed index trick long before Oracle,
so hopefully Oracle has no rights to it.  I'm not sure it meets the criteria
for non-obvious anyway.  We use BRS here for trademark searching and make
extensive use of the reversed index capability for left-hand wild card
searches.

Tom

-----Original Message-----
From: tatu@hypermall.net [mailto:tatu@hypermall.net]
Sent: Thursday, May 29, 2003 8:58 PM
To: lucene-user@jakarta.apache.org
Subject: Re: Wildcard workaround


On Wednesday 28 May 2003 05:43, David Medinets wrote:
> ----- Original Message -----
> From: "Andrei Melis" <Andrei.Melis@snt.ro>
>
> > As far as I have understood, lucene does not allow search queries
> > starting with wildcards. I have a file database indexed by content
> > and also by filename. It would be nice if the user could perform a
> > usual search like "*.ext".
>
> Does anyone know if Oracle patented the technique that they use for *ext
> searching in the Oracle Text product. If not, I'm sure the technique can
be
> borrowed.
>
> On the other hand, the slow technique of comparing each term to *.ext can
> certainly be implemented with a minimum of effort, I think.

[apologies if somebody else already pointed this out... I missed some mails
to 
the list from yesterday]

One of the most interesting solutions somebody posted earlier, was to use
2 indexes; one for 'normal' searches, with normal analyzer etc, and second
one that uses reversed words; ie. analyzer reverses words tokenized by
standard analyzer. This second index would then allow for searches
to do prefix match, in this case query would be something like

reverse_field:txe.*

This would work efficiently, although pretty much double the size of index
for 
content that has to be prefix-searchable. Still, this solution somehow 
appeals to my hacker side. :-)

In this specific case, though, what others have suggested (add file prefix
as 
separate field to search on), is probably more practical.

-+ Tatu +-


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message