From: "Bryan LaPlante"
Subject: Indexing a database
Date: Tue, 10 Jun 2003 11:09:23 -0500
Message-ID: <003601c32f6a$a8858f70$d9321c41@vincent>
List-Id: "Lucene Developers List"

Hi,

I need some input about what direction to take. I have written a package for indexing a database using a query or list of tables to be indexed.
I wanted control over whether and how each column in each row of the result gets indexed. I'll describe the structure first, and then the problem where I need advice.

Structure:

    // create an instance:
    ds = DataStore(String driver, String uri, String pswd, String user)
    // pass ds to:
    dir = DSDirectory(DataStore ds, String query);

Calling dir.list() now produces the entire result set: one DSFile() per row, each holding an internal Hashtable of attributes, one per column.

The problem: As you may have guessed, this is quite memory intensive for a sizable recordset. I need a way either to hold a reference to the records and let the user incrementally request the next n rows to index, or to store the records in a temporary (non-memory) location from which they can be retrieved on request.

Possible solution: One idea is to create a temporary Lucene index representing the rows and columns. When the list method is called, I could retrieve the data from that index and thereby avoid the in-memory constraints.

Thoughts?

Bryan LaPlante
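
The first option (letting the caller request the next n rows incrementally) could look roughly like the sketch below. This is only an illustration under assumptions: RowBatcher and fromResultSet are hypothetical names that are not part of the package described above, rows are modelled as a Map of column name to string value rather than a DSFile, and the java.sql.ResultSet is assumed to stay open while batches are being pulled.

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical helper: hands out rows in batches of at most n instead of all at once. */
class RowBatcher {
    private final Iterator<Map<String, String>> rows;

    RowBatcher(Iterator<Map<String, String>> rows) {
        this.rows = rows;
    }

    /** True while the underlying source still has unread rows. */
    boolean hasMore() {
        return rows.hasNext();
    }

    /** Returns up to n rows; an empty list means the source is exhausted. */
    List<Map<String, String>> next(int n) {
        List<Map<String, String>> batch = new ArrayList<>();
        while (batch.size() < n && rows.hasNext()) {
            batch.add(rows.next());
        }
        return batch;
    }

    /** Adapts an open JDBC ResultSet into a lazy row iterator (one row read per call). */
    static Iterator<Map<String, String>> fromResultSet(final ResultSet rs) {
        return new Iterator<Map<String, String>>() {
            private Boolean hasNext; // null means the cursor has not been advanced yet

            @Override
            public boolean hasNext() {
                try {
                    if (hasNext == null) {
                        hasNext = rs.next();
                    }
                    return hasNext;
                } catch (SQLException e) {
                    throw new IllegalStateException(e);
                }
            }

            @Override
            public Map<String, String> next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                try {
                    ResultSetMetaData md = rs.getMetaData();
                    Map<String, String> row = new LinkedHashMap<>();
                    for (int i = 1; i <= md.getColumnCount(); i++) {
                        row.put(md.getColumnName(i), rs.getString(i));
                    }
                    hasNext = null; // force a cursor advance on the next hasNext()
                    return row;
                } catch (SQLException e) {
                    throw new IllegalStateException(e);
                }
            }
        };
    }
}
```

With something like this, the indexing loop would call next(n) until hasMore() returns false, so only one batch of rows is held in memory at a time. Whether the driver itself buffers the whole result set is a separate question that depends on the JDBC driver in use.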