lucene-java-user mailing list archives

From fulin tang <tangfu...@gmail.com>
Subject Re: Is Lucene a good choice for PB scale mailbox search?
Date Thu, 26 Nov 2009 01:34:58 GMT
Thanks, all, for the good suggestions!

But any ideas about storage? How can we make the indexes as small as possible?

We know compression is the only way, but when and where is it best to
compress, as far as search is concerned?

Thanks again, all!
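
To put a number on the storage concern, here is a rough back-of-envelope sketch of the duplication issue raised in the quoted reply below: if each mail is indexed once per recipient mailbox, the amount of text to index grows with the recipient count. This is not from the thread; the class name, method names, and sample figures are all made up for illustration, and it only counts raw bytes fed to the indexer, not actual Lucene index size.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical estimate of how much text gets indexed under two sharding schemes. */
public class ShardingEstimate {

    /** Illustrative mail record: body size and number of recipient mailboxes on this service. */
    public static final class Mail {
        final long bytes;
        final int recipients;

        public Mail(long bytes, int recipients) {
            this.bytes = bytes;
            this.recipients = recipients;
        }
    }

    /** Bytes indexed when every recipient's private index gets its own copy of the mail. */
    public static long perUserIndexedBytes(List<Mail> mails) {
        long total = 0;
        for (Mail m : mails) {
            total += m.bytes * m.recipients;
        }
        return total;
    }

    /** Bytes indexed when each mail is indexed exactly once (for example, time-based shards). */
    public static long singleCopyIndexedBytes(List<Mail> mails) {
        long total = 0;
        for (Mail m : mails) {
            total += m.bytes;
        }
        return total;
    }

    /** Stable shard number for a user; floorMod keeps the result in [0, shardCount). */
    public static int shardFor(String userId, int shardCount) {
        return Math.floorMod(userId.hashCode(), shardCount);
    }

    public static void main(String[] args) {
        // Made-up sample: sizes in bytes, recipient counts are assumptions.
        List<Mail> sample = Arrays.asList(
                new Mail(10_000L, 1),
                new Mail(20_000L, 5),
                new Mail(5_000L, 2));

        long perUser = perUserIndexedBytes(sample);
        long single = singleCopyIndexedBytes(sample);
        System.out.println("per-user copies: " + perUser + " bytes");
        System.out.println("single copy:     " + single + " bytes");
        System.out.println("duplication factor: " + (double) perUser / single);
        System.out.println("shard for user 'alice' of 16: " + shardFor("alice", 16));
    }
}
```

Running the same two sums over a sample of real mail headers would show whether per-user indexes are affordable or whether indexing each mail once and filtering by user at query time is worth the extra complexity.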


2009/11/24 Kay Kay <kaykay.unique@gmail.com>:
> fulin tang wrote:
>>
>> We are going to add full-text search to our mailbox service.
>>
>> The problem is that we have more than 1 PB of mail there, and obviously we
>> don't want to add another PB of storage for the search service, so we hope
>> the index data will be small enough to store while search stays
>> fast.
>>
>> Luckily, every user searches only their own mail, so
>> we can split the data into many indexes instead of keeping it all in
>> one big one.
>>
>
> If it is going to be sharded by the 'To' or 'Cc' list, then potentially the
> mail information is going to be duplicated in proportion to the number of
> people in an email thread. Selecting some other dimension for sharding,
> such as time, might be useful to begin with.
>>
>> So, after all these concerns, the question is: is Lucene a good
>> choice for this? Or what is the right way to do this? Has anyone
>> done this before?
>>
>
> With a PB of storage, check out Solr sharding / Katta for prior work in this
> area.
>
>> All opinions and comments are welcome!
>>
>> fulin
>>
>>
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>



-- 
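The beginning of the dream struggles at the edge of the city
The far horizon of the heart holds fast in the instant of a footstep
My fate has buried the eternity of loneliness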
