lucene-java-user mailing list archives

From "Ashley Collins" <>
Subject Indexing email messages?
Date Fri, 06 Dec 2002 10:12:24 GMT

I'm using Lucene to index MIME messages and have a couple of questions.

1) What is the best way to handle keyword fields that are repeated, such as 
"recipient"?

At the moment I have a for loop over the recipient addresses doing:

        document.add(Field.Keyword("recipient", address));

But this seems to limit query results to messages that were sent only to the 
person I'm searching for; messages with additional recipients are not returned.
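
To spell out what I expected from that loop, here is a rough model in plain 
Java. It is not Lucene code: the RecipientIndex class and its methods are names 
made up for illustration, and it only assumes that each address is kept as one 
exact, untokenized term pointing at the message.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model, not the Lucene API: maps each exact address to the
// ids of the messages that list it as a recipient.
class RecipientIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Register one message; every recipient becomes its own exact term.
    void addMessage(int messageId, String... recipients) {
        for (String address : recipients) {
            postings.computeIfAbsent(address, k -> new TreeSet<>()).add(messageId);
        }
    }

    // Exact-match lookup: no tokenizing, no case folding.
    Set<Integer> search(String address) {
        return postings.getOrDefault(address, Collections.<Integer>emptySet());
    }
}
```

In this model, a message added with two recipients comes back for a search on 
either address, which is the behaviour I am after.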

Or should I use Field.Text instead, write a custom analyzer that doesn't 
split email addresses, and store a single "recipients" field holding a 
whitespace-separated list of all the recipients?
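
For this second option, the tokenization I have in mind would look roughly like 
the plain-Java sketch below. It does not use Lucene's analyzer classes; 
RecipientTokens.tokenize is a made-up helper that only shows the intended 
splitting rule, where whitespace separates tokens and punctuation such as "@" 
and "." stays inside them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: splits a "recipients" value on
// whitespace only, so each email address survives as a single token.
class RecipientTokens {
    static List<String> tokenize(String recipients) {
        List<String> tokens = new ArrayList<>();
        for (String part : recipients.trim().split("\\s+")) {
            if (!part.isEmpty()) {   // an empty input yields one empty string
                tokens.add(part);
            }
        }
        return tokens;
    }
}
```

A custom analyzer would have to apply the same rule at index time and at query 
time so that the stored tokens and the query terms line up.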

2) I also store the sender in a keyword field, but searching isn't 
consistent. I can find some addresses, but not others. Where should I start 
looking for information to help with debugging?

3) Also, how do I make sure that query terms for untokenized keyword 
fields don't get tokenized by QueryParser.parse()? I tried using the 
WhitespaceAnalyzer, but searching was still inconsistent.

Thanks in advance!
Ashley Collins
