lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Hasan Diwan <hasan.di...@gmail.com>
Subject Re: Email Indexing
Date Thu, 28 Oct 2010 06:05:17 GMT
On 27 October 2010 18:16, Troy Wical <troy@wical.com> wrote:
> Depends on what your trying to index, I suppose. Maildir or mbox? For some time now,
off and on, I have been working to index an ezmlm mailing list archive. In the end, I went
with Swish-E and have made quite a bit of progress. I am short of my complete goal though.
The issue is that the search results do not return results that contain the subject, and there
is currently no excerpt or phrase highlighting. My problem is the flat text email files I
am working with have no xml or anything to help the indexer create fields from. I've not yet
figured out how to convert the emails to xml.

Neither Maildir or mbox -- IMAP/POP doesn't care. Basically, I want to
build the index based on the contents of (my) gmail box. I can
retrieve the messages using IMAP, just need to figure out the
structure of the index.

Converting email to XML? Email me off-list and I'll provide you with
some help (as email => XML has little to do with lucene).
-- 
Sent from my mobile device
Envoyait de mon telephone mobil

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message