lucene-java-user mailing list archives

From "Michael D. Curtin" <>
Subject Re: Wildcard
Date Fri, 02 Dec 2005 23:55:44 GMT
John Powers wrote:

> Hello,
> Lucene only lets you use a wildcard after a term, not before, correct?
> What work arounds are there for that?
> If I have an item 108585-123
> And another 332323-123
> How can I look for all the -123 family of items?

Classic indexing problem.  Here are a couple of simple ideas.

1.  If the dash in these items means that the two sets of numbers are, 
in fact, different attributes, AND you only ever need to match on whole 
attributes, then separate them into their own fields in Lucene.  That 
is, instead of a FOOBAR field with entries like 108585-123, have a FOO 
field with entries like 108585 and 332323, and a BAR field with entries 
like 123.  To look for all items in the 123 family, simply search for 
123 on the BAR field alone.
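
A rough sketch of idea 1 in plain Java, with no Lucene dependency.  The 
class and method names (SplitFieldIndex, add, findByBar) are made up for 
illustration; in a real index FOO and BAR would be two separate fields 
on each document, and the lookup would be an ordinary term search on BAR.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: splits item codes at the dash
// and groups the FOO parts by BAR, mimicking an exact match on a BAR field.
final class SplitFieldIndex {
    private final Map<String, List<String>> fooByBar = new LinkedHashMap<>();

    // Index one item such as "108585-123" as FOO=108585, BAR=123.
    void add(String item) {
        int dash = item.indexOf('-');
        if (dash < 0) {
            throw new IllegalArgumentException("expected FOO-BAR, got: " + item);
        }
        String foo = item.substring(0, dash);
        String bar = item.substring(dash + 1);
        fooByBar.computeIfAbsent(bar, k -> new ArrayList<>()).add(foo);
    }

    // Return the FOO values of every item whose BAR equals the given family.
    List<String> findByBar(String bar) {
        return fooByBar.getOrDefault(bar, new ArrayList<>());
    }
}
```

After adding 108585-123 and 332323-123, findByBar("123") returns both 
FOO values without any wildcard at all.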

2.  If you want arbitrary suffix matching for ITEM (in addition to the 
arbitrary prefix matching Lucene already provides), encode ITEM in the 
index in both forward and reverse directions.  In other words, ITEM has 
entries like 108585-123 and 332323-123, and REVITEM has the same entries 
reversed character by character: 321-585801 and 321-323233.  To look for 
all items in the 123 family, search for "321*" on REVITEM (or "321-*" if 
the dash must match too).  To look for all items ending in 3, the 
equivalent of "*3" in ITEM, search for "3*" in REVITEM.
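
A rough sketch of idea 2, again in plain Java with no Lucene dependency.  
A sorted set stands in for the REVITEM term dictionary, and the names 
(ReversedTermIndex, reverse, findBySuffix) are made up for illustration.  
In Lucene itself the scan below would presumably be a prefix query on 
the REVITEM field.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: stores each item reversed so
// that a suffix search on ITEM becomes a prefix scan on the reversed terms.
final class ReversedTermIndex {
    private final TreeSet<String> reversedTerms = new TreeSet<>();

    // "108585-123" -> "321-585801"
    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    // Index the reversed form of the item (the REVITEM value).
    void add(String item) {
        reversedTerms.add(reverse(item));
    }

    // Equivalent of "*suffix" on ITEM: reverse the suffix and walk the
    // sorted terms starting at that prefix until the prefix stops matching.
    List<String> findBySuffix(String suffix) {
        String prefix = reverse(suffix);
        List<String> items = new ArrayList<>();
        for (String term : reversedTerms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break; // terms sharing a prefix are contiguous in sorted order
            }
            items.add(reverse(term)); // undo the reversal for display
        }
        return items;
    }
}
```

With 108585-123 and 332323-123 indexed, findBySuffix("-123") and 
findBySuffix("3") both return the two items.  The cost is one extra 
field per document, in exchange for never needing a leading wildcard.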

Good luck!

