lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Neil Pitman <>
Subject Re: Breaking up words with a certain pattern and search by parts
Date Mon, 08 Apr 2002 01:22:47 GMT
Hi Sheldon,

It was my understanding that you should parse the input text yourself 
(since you understand the deeper semantics).  When you see "MEM12345" 
you can add {"MEM12345", "MEM" and "12345"} into the words to index. 
This is similar to converting the words to lowercase or stripping 
accents when analyzing a text.

If you wanted to get even more fancy you could create one or more 
separate fields for the product codes or fragments of product code. 
Your GUI could use them to search on.

Does this make sense?
Sheldon Shi wrote:

> In my project I would like to search for product code such as
> MEM12345 either by "MEM" or by "12345". I can't do that right
> now in Lucene 1.2. Prefix query doesn't do prefix search followed
> by numbers, and there is no "end with" type of search. How do I 
> modify the HTMLParser to index MEM12345 as two words MEM and 12345 
> instead of one?
> Thanks.
> Sheldon
> --
> To unsubscribe, e-mail:   <>
> For additional commands, e-mail: <>

Neil Pitman

To unsubscribe, e-mail:   <>
For additional commands, e-mail: <>

View raw message