From: Shashi Kant
Date: Mon, 23 Nov 2009 22:59:03 -0500
Subject: Re: Is Lucene a good choice for PB scale mailbox search?
To: java-user@lucene.apache.org

Hi,

I have not worked at petabyte scale (yet!), mostly at the scale of tens of
terabytes, but I do think Lucene would be very helpful for such a use case.

I would indeed suggest partitioning the index by user. It seems the most
logical, straightforward way, and it also offers the security of insulating
one user's emails from others'.

Take a look at Compass and Solr (based on Lucene); they might be more
oriented to your needs.

HTH,
Shashi

On Mon, Nov 23, 2009 at 9:35 PM, fulin tang wrote:
> We are going to add full-text search to our mailbox service.
>
> The problem is that we have more than 1 PB of mail there, and obviously we
> don't want to add another PB of storage for the search service, so we hope
> the index data will be small enough to store while search stays fast.
>
> Luckily, every user searches only their own mail, so we can split the
> data into many indexes instead of keeping it all in one big one.
>
> So, after all these concerns, the question is: is Lucene a good
> choice for this? Or what is the right way to do this? Has anyone
> done this before?
>
> All opinions and comments are welcome!
>
> fulin
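
A minimal sketch of the per-user partitioning suggested in the reply, written in Python for brevity rather than against Lucene's Java API. Everything named here is an assumption for illustration: `INDEX_ROOT`, `index_path_for_user`, and the two-level directory layout are hypothetical, not something from the thread. The idea is only that a user ID maps deterministically to one index location, so a search touches that user's index and nothing else.

```python
import hashlib
from pathlib import PurePosixPath

# Hypothetical root directory under which all per-user indexes live.
INDEX_ROOT = PurePosixPath("/srv/mail-index")


def index_path_for_user(user_id: str) -> PurePosixPath:
    """Map a user ID to the directory holding that user's own index.

    The ID is hashed so the directory name is filesystem-safe, and the
    first two hex digits form a bucket directory (256 buckets) so that
    no single directory ends up with millions of entries.
    """
    digest = hashlib.sha1(user_id.encode("utf-8")).hexdigest()
    bucket = digest[:2]
    return INDEX_ROOT / bucket / digest


if __name__ == "__main__":
    # The same user always resolves to the same index location.
    print(index_path_for_user("alice@example.com"))
```

With a layout like this, indexing and searching for one user would both open only the directory returned for that user, which is also what gives the isolation between users mentioned in the reply.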