lucene-solr-user mailing list archives

From Kay Kay <kaykay.uni...@gmail.com>
Subject Re: Searching .msg files
Date Tue, 15 Dec 2009 03:02:10 GMT
I remember seeing a similar thread on the Lucene user mailing list. Its
archives are worth checking.

As for strategies, there are two options:

* Create one index per user, store in it the emails involving that user,
and search only that index.
(or)

* Keep one large index with the To/Cc/Bcc names as fields, and run every
search by a given user through an initial filter pass on those fields.
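
For the second option, here is a rough sketch of what the filter pass
could look like, using Solr's standard q and fq request parameters. The
field names (to, cc, bcc), the base URL and the helper names are
assumptions for illustration, not something your schema already has.

```python
from urllib.parse import urlencode

# Hypothetical field names; the real ones come from your own schema.
RECIPIENT_FIELDS = ("to", "cc", "bcc")


def escape_phrase(value):
    """Escape backslashes and double quotes for use inside a quoted phrase."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_search_params(user_address, user_query):
    """Return request parameters that restrict hits to the user's own mail."""
    phrase = escape_phrase(user_address)
    recipient_filter = " OR ".join(
        '{0}:"{1}"'.format(field, phrase) for field in RECIPIENT_FIELDS
    )
    # q carries what the user typed; fq narrows the candidate set first.
    return {"q": user_query, "fq": recipient_filter}


def build_search_url(base_url, user_address, user_query):
    """Build a full select URL (base_url is an assumed Solr location)."""
    params = build_search_params(user_address, user_query)
    return base_url.rstrip("/") + "/select?" + urlencode(params)
```

The address in the filter should come from the authenticated session on
the server side, never from anything the user can type, otherwise the
restriction is easy to bypass.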

Solr can, of course, index a variety of content (see the Tika project) and
is not restricted to XML at all.
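
As a rough sketch of the Tika route: Solr's extracting request handler
(Solr Cell, which wraps Tika) accepts a file upload and takes extra
field values as literal.<fieldname> parameters. The snippet below only
builds the request URL. It assumes the handler is mapped at
/update/extract in your solrconfig.xml and that your unique key field is
called id. The file itself would go in the body of the POST. Whether
your Tika version parses Outlook .msg files is worth checking first.

```python
from urllib.parse import urlencode


def build_extract_url(base_url, doc_id, commit=False):
    """Build the URL for posting one file to the extracting handler.

    base_url and the /update/extract mapping are assumptions about the
    local Solr setup; doc_id is stored in the (assumed) "id" field.
    """
    params = [("literal.id", doc_id)]
    if commit:
        params.append(("commit", "true"))
    return base_url.rstrip("/") + "/update/extract?" + urlencode(params)
```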

You would need to weigh the pros and cons of each depending on the size
of the corpus and your usage and performance expectations for the
search.
Once you have settled on a strategy, you can define the Solr schema for
the fields it needs.
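
To make the field idea concrete, here is a small sketch that turns one
email into Solr's XML update message (<add><doc><field .../></doc></add>).
The field names (id, to, body) are hypothetical and would have to match
whatever you define in schema.xml. A field with several values, such as
the recipients, is simply repeated, which requires it to be declared
multiValued in the schema.

```python
import xml.etree.ElementTree as ET


def build_add_xml(doc):
    """Serialize a dict of field values as a Solr XML <add> message.

    List or tuple values become repeated <field> elements, which is how
    multi-valued fields are sent in the XML update format.
    """
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        for item in values:
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(item)
    return ET.tostring(add, encoding="unicode")
```

So the XML is only the envelope Solr uses for updates. The email body
goes in as plain text inside one of the fields.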




Abhishek Srivastava wrote:
> Hello Everyone,
>
> In my company, we store a lot of old emails (.msg files) in a database (done
> for the purpose of legal compliance).
>
> The users have been asking us to give search functionality on the old
> emails.
>
> One of the primary requirement is that when people search, they should only
> be able to search in their own emails (emails in which they were in the to,
> cc or bcc list).
>
> How can solr be used?
>
> from what I know about this product is that it only searches xml content...
> so I will have to extract the body of the email and convert it to xml right?
>
> How will I limit the search results to only those emails where the user who
> is searching was in the to, cc or bcc list?
>
> Please do recommend me an approach for providing a solution to our
> requirement.
>
>   

