lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: one huge index or many small ones?
Date Thu, 04 Nov 2004 16:20:20 GMT
One index per e-mail is way overkill and probably not even feasible 
resource-wise.  Instead, take advantage of fields in Lucene documents and 
use BooleanQuery to AND in other criteria for filtering, or use a Filter 
if the filtering criteria are relatively static.
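
To make the single-index idea concrete, here is a minimal sketch in plain Java. It is a toy inverted index and deliberately does not use the Lucene API; the class name SingleIndexSketch, the "id" and "body" field names, and the message ids are assumptions made up for illustration. Each e-mail is one document, and a search ANDs the text term with the set of allowed ids, which is roughly the role a BooleanQuery with required clauses (or a Filter) would play in Lucene.

```java
import java.util.*;

// Toy model of one shared index. NOT the Lucene API; all names are hypothetical.
final class SingleIndexSketch {
    // Postings: "field:term" -> sorted set of document numbers.
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private int nextDoc = 0;

    // Index one e-mail as one document: an untokenized "id" field
    // plus a lowercased, tokenized "body" field.
    int addEmail(String emailId, String body) {
        int doc = nextDoc++;
        post("id", emailId, doc);
        for (String token : body.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                post("body", token, doc);
            }
        }
        return doc;
    }

    private void post(String field, String term, int doc) {
        postings.computeIfAbsent(field + ":" + term, k -> new TreeSet<>()).add(doc);
    }

    private Set<Integer> docsFor(String field, String term) {
        return postings.getOrDefault(field + ":" + term, Collections.emptySet());
    }

    // Evaluates: body:<bodyTerm> AND (id:<a> OR id:<b> OR ...).
    Set<Integer> search(String bodyTerm, Collection<String> allowedIds) {
        // Union of the documents whose id is in the pre-selected subset.
        Set<Integer> allowed = new TreeSet<>();
        for (String id : allowedIds) {
            allowed.addAll(docsFor("id", id));
        }
        // Intersect the text hits with the allowed subset (the AND step).
        Set<Integer> hits = new TreeSet<>(docsFor("body", bodyTerm.toLowerCase()));
        hits.retainAll(allowed);
        return hits;
    }
}
```

With this shape, narrowing a search to a handful of e-mails costs one extra clause per query rather than one index per message.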

	Erik

On Nov 4, 2004, at 11:00 AM, javier muguruza wrote:

> Hi,
>
> We are going to move from a just-in-time Perl-based search to using
> Lucene in our project. I have to index emails (bodies and also
> attachments). I keep all the bodies and attachments in the filesystem
> for a long period of time. I have to find emails that fulfill certain
> conditions. Some of the conditions are taken care of at a different
> level, so in the end I have a SUBSET of emails I have to run through
> Lucene.
>
> I was assuming that the best way would be to create an index for each
> email. Having a single index for a group of emails (say a day's worth
> of email) seems too coarse-grained: imagine a day has 10000 emails,
> and some queries will need to look in only a handful of those
> emails... But the problem with having one index per email is the
> massive number of emails... imagine having 100000 indexes.
>
> Anyway, any ideas about that? I just wanted to check whether anyone
> feels I am wrong.
>
> Thanks
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org

