lucene-java-user mailing list archives

From "Ashley Collins" <acoll...@hotmail.com>
Subject Indexing email messages?
Date Fri, 06 Dec 2002 10:12:24 GMT

I'm using Lucene to index MIME messages and have a couple of questions.

1) What is the best way to handle keyword fields that are repeated, such as 
"recipient"?

At the moment I have a for loop that adds one keyword field per address:

        document.add(Field.Keyword("recipient", address));

But this seems to limit query results to messages that were sent only to the 
person I'm searching for. Messages with additional recipients are not returned.


Or should I use Field.Text instead, write a custom analyzer that doesn't 
split email addresses, and store a single "recipients" field holding a 
whitespace-separated list of all the recipients?
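
To make the second option concrete, here is a minimal plain-Java sketch of the token stream such a custom analyzer would need to produce. It makes no Lucene calls; the class RecipientsFieldSketch and its methods are hypothetical names used only for illustration, and the addresses are placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: models the single "recipients" field idea.
final class RecipientsFieldSketch {

    // Join all recipient addresses into one whitespace-separated field value.
    static String joinRecipients(List<String> addresses) {
        return String.join(" ", addresses);
    }

    // Split on whitespace only, so an address such as "bob.smith@example.org"
    // stays a single token instead of being broken at '.' or '@'.
    static List<String> tokenize(String fieldValue) {
        List<String> tokens = new ArrayList<>();
        for (String token : fieldValue.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String value = joinRecipients(
                Arrays.asList("alice@example.com", "bob.smith@example.org"));
        // Each address should come back as exactly one token.
        System.out.println(tokenize(value));
    }
}
```

If the analyzer behaved like tokenize() above, a search for one address should match any message whose "recipients" value contains it, however many other recipients there are.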


2) I also store the sender in a keyword field, but searching isn't 
consistent. I can find some addresses, but not others. Where should I start 
looking for information to help with debugging?


3) Also, how do I make sure query terms that are for untokenized keyword 
fields don't get tokenized by QueryParser.parse()? I tried using the 
WhitespaceAnalyzer, but searching was still inconsistent.
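
The concern in question 3 can be illustrated with a small plain-Java model. This is only a sketch under assumptions: it is not Lucene code, KeywordLookupSketch is a hypothetical class, and analyze() is a made-up stand-in for whatever analysis a query string might go through (here: lowercasing and splitting on punctuation). It shows why a value indexed as one exact term can be missed once the query text has been altered.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Plain-Java model, not Lucene: an untokenized field keeps each value as one exact term.
final class KeywordLookupSketch {

    // Terms stored verbatim, the way an untokenized keyword value would be.
    private final Set<String> terms = new HashSet<>();

    void addKeyword(String value) {
        terms.add(value);
    }

    // Exact lookup: the query string is used unchanged as a single term.
    boolean matchesExact(String query) {
        return terms.contains(query);
    }

    // Hypothetical query-time analysis (an assumption for illustration):
    // lowercase the text, then split on anything that is not a letter or digit.
    static List<String> analyze(String query) {
        return Arrays.asList(query.toLowerCase(Locale.ROOT).split("[^a-z0-9]+"));
    }

    // Analyzed lookup: succeeds only if one of the analyzed pieces is an indexed term.
    boolean matchesAnalyzed(String query) {
        for (String piece : analyze(query)) {
            if (terms.contains(piece)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        KeywordLookupSketch index = new KeywordLookupSketch();
        index.addKeyword("Jane.Doe@Example.com");
        System.out.println("exact=" + index.matchesExact("Jane.Doe@Example.com"));
        System.out.println("analyzed=" + index.matchesAnalyzed("Jane.Doe@Example.com"));
    }
}
```

In this model the exact lookup finds the address and the analyzed lookup does not, because none of the analyzed pieces equals the stored term. Whether the same thing is happening in the real index is a guess; comparing the terms actually stored for the field with the terms the parsed query contains would confirm or rule it out.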


Thanks in advance!
Ashley Collins