Date: Tue, 2 Dec 2003 15:38:11 -0800
From: Dror Matalon
To: Lucene-Users-List
Subject: Re: Ways to search indexes
Message-ID: <20031202233811.GG34796@rlx11.zapatec.com>
References: <20031202135458.73205.qmail@web25204.mail.ukl.yahoo.com>
In-Reply-To: <20031202135458.73205.qmail@web25204.mail.ukl.yahoo.com>

On Tue, Dec 02, 2003 at 01:54:58PM +0000, jt oob wrote:
> Hi,
>
> I have just indexed a lot of news (nntp) postings.
> I now have an index for each topic (a topic can have many newsgroups)
>
> The index sizes are:
>
> 2.6G Current Affairs
> 2.4G Celebs
> 119M Recreation
> 3.0M Tech - Mac
> 2.4G Tech - Windows
> 936M Tech - Linux
> 702M Tech - Other
> 96M  Tech - Consoles

Around 15 gigs. How many days of news?

>
> This is still only early stages so i haven't yet done any parsing, just
> treating each doc as plain text.
>
> Originally I was merging all these indexes together, but this is now
> not feasible with new additions being made to each index as new
> postings arrive.
> I optimize each index at midnight.
>
> What is the best way to allow users to query either just one index, or
> the whole lot?

Probably create an IndexSearcher for each index, then use a
MultiSearcher to search them all together. It'll probably use quite a
bit of memory.

>
> My prototype was making a system call from and running my java program
> to print all the results to the screen. I know this isn't the best way
> to do it :-)
>
> I guess I need to write a server and periodically re-open the indexes
> to see any changes?
>
> Thank you for any help!
>
> jt

-- 
Dror Matalon
Zapatec Inc
1700 MLK Way
Berkeley, CA 94709
http://www.fastbuzz.com
http://www.zapatec.com

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
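
The shape of the suggestion above (one searcher per index, plus one
searcher that covers all of them) can be sketched in plain Java. This is
only a toy model using the standard library, not Lucene code: the names
TopicSearcher, InMemorySearcher, CombinedSearcher and Hit, the sample
postings, and the "count the term occurrences" scoring are all
assumptions made for illustration. In a real setup the per-index
searchers and the combined one would presumably be Lucene's
IndexSearcher and MultiSearcher.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of "one searcher per index, plus one searcher over all of them". */
class MultiIndexSketch {

    /** One search result: the topic it came from, the posting text, and a score. */
    static final class Hit {
        final String topic;
        final String text;
        final int score;

        Hit(String topic, String text, int score) {
            this.topic = topic;
            this.text = text;
            this.score = score;
        }
    }

    /** Hypothetical stand-in for a per-index searcher (not a Lucene type). */
    interface TopicSearcher {
        List<Hit> search(String term);
    }

    /** In-memory "index": scores a posting by how often the term occurs in it. */
    static final class InMemorySearcher implements TopicSearcher {
        private final String topic;
        private final List<String> postings;

        InMemorySearcher(String topic, List<String> postings) {
            this.topic = topic;
            this.postings = postings;
        }

        @Override
        public List<Hit> search(String term) {
            String needle = term.toLowerCase();
            List<Hit> hits = new ArrayList<>();
            for (String posting : postings) {
                int count = 0;
                for (String word : posting.toLowerCase().split("\\W+")) {
                    if (word.equals(needle)) {
                        count++;
                    }
                }
                if (count > 0) {
                    hits.add(new Hit(topic, posting, count));
                }
            }
            return hits;
        }
    }

    /** Sends the query to every sub-searcher and merges the hits, best score first. */
    static final class CombinedSearcher implements TopicSearcher {
        private final List<TopicSearcher> parts;

        CombinedSearcher(List<TopicSearcher> parts) {
            this.parts = parts;
        }

        @Override
        public List<Hit> search(String term) {
            List<Hit> merged = new ArrayList<>();
            for (TopicSearcher part : parts) {
                merged.addAll(part.search(term));
            }
            merged.sort(Comparator.comparingInt((Hit h) -> h.score).reversed());
            return merged;
        }
    }

    /** Builds two small topic searchers from made-up postings. */
    static Map<String, TopicSearcher> sampleSearchers() {
        Map<String, TopicSearcher> byTopic = new LinkedHashMap<>();
        byTopic.put("Tech - Linux", new InMemorySearcher("Tech - Linux",
                List.of("kernel panic after kernel upgrade", "which distro for a laptop")));
        byTopic.put("Tech - Windows", new InMemorySearcher("Tech - Windows",
                List.of("driver crash, is it a kernel bug")));
        return byTopic;
    }

    public static void main(String[] args) {
        Map<String, TopicSearcher> byTopic = sampleSearchers();
        TopicSearcher all = new CombinedSearcher(new ArrayList<>(byTopic.values()));

        // Query a single topic, then the whole lot, through the same interface.
        for (Hit h : byTopic.get("Tech - Linux").search("kernel")) {
            System.out.println("[one topic] " + h.topic + ": " + h.text);
        }
        for (Hit h : all.search("kernel")) {
            System.out.println("[all topics] " + h.topic + ": " + h.text);
        }
    }
}
```

Because the combined searcher implements the same interface as the
per-topic ones, the calling code does not care whether the user picked
one topic or all of them. The trade-off mentioned above still applies:
every sub-searcher stays open at once, so memory use grows with the
number of indexes.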