Date: Wed, 19 Mar 2003 11:43:26 -0500
To: Lucene Users List
From: Avi Drissman
Subject: Putting the Lucene index into a database

I've successfully used Lucene to index about 50-100K files, and have been keeping the index on a local disk. It's time to move up, and now I'm planning to index 100-500K files. I'm trying to decide whether or not it pays to hold the index in our database.
Our database (FrontBase) has decent blob support, and a ~300 meg index likely wouldn't faze it, but I have some concerns.

First, I'm looking at Directory, and there are two functions:

* OutputStream createFile(String name)
* InputStream openFile(String name)

How much of the stream functionality does Lucene take advantage of? Does it seek around? I'm concerned about huge rewrites of files.

Second is speed. I was looking at SQLDirectory, and although I'd probably write my own (inspired by that), who's using it? How does the speed compare to flat files?

Third is replication. We're aiming for a replicated environment. If we wanted to build the index on disk rather than in the database, every server would have to keep its own copy.

Does anyone have any experience with this?

Thanks.

Avi
-- 
Avi 'rlwimi' Drissman
avi@baseview.com
Argh! This darn mail server is trunca
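
To make the first concern concrete, here is a minimal sketch of the shape a blob-backed file store might take, assuming reads do seek. Everything in it is hypothetical: BlobStoreSketch, BlobReader, and their methods are illustrative names, not part of Lucene, SQLDirectory, or FrontBase, and an in-memory map stands in for the database table. With real JDBC blobs, the seek-then-read step could plausibly map onto a ranged read such as java.sql.Blob.getBytes(pos, length).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: BlobStoreSketch and BlobReader are illustrative names,
// not part of Lucene, SQLDirectory, or FrontBase.
class BlobStoreSketch {
    // Stand-in for a database table of (file name, blob) rows.
    private final Map<String, byte[]> rows = new HashMap<>();

    // Running total of bytes stored, to make the cost of rewrites visible.
    long bytesWritten = 0;

    // Store a whole file as one blob; replacing a file rewrites all of it.
    void createFile(String name, byte[] contents) {
        rows.put(name, contents.clone());
        bytesWritten += contents.length;
    }

    // Open a reader positioned at the start of the named file.
    BlobReader openFile(String name) {
        byte[] blob = rows.get(name);
        if (blob == null) {
            throw new IllegalArgumentException("no such file: " + name);
        }
        return new BlobReader(blob);
    }

    // Minimal seekable reader over one blob.
    static final class BlobReader {
        private final byte[] blob;
        private int position = 0;

        BlobReader(byte[] blob) {
            this.blob = blob;
        }

        // Jump to an absolute offset within the blob.
        void seek(int offset) {
            if (offset < 0 || offset > blob.length) {
                throw new IndexOutOfBoundsException("offset " + offset);
            }
            position = offset;
        }

        // Read one byte at the current position and advance.
        byte readByte() {
            if (position >= blob.length) {
                throw new IndexOutOfBoundsException("read past end of file");
            }
            return blob[position++];
        }
    }

    public static void main(String[] args) {
        BlobStoreSketch store = new BlobStoreSketch();
        store.createFile("segment", new byte[] {10, 20, 30, 40, 50});
        BlobReader in = store.openFile("segment");
        in.seek(3);                          // jump forward
        System.out.println(in.readByte());
        in.seek(1);                          // jump backward
        System.out.println(in.readByte());
        System.out.println(store.bytesWritten);
    }
}
```

The bytesWritten counter is there only to show the rewrite worry: in this sketch, replacing a file costs its full length every time, which is the behavior to watch for if files are rewritten often.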