From: Doug Cutting
To: "'eliot@isogen.com'", lucene-user@jakarta.apache.org
Subject: RE: Index Optimization: Which is Better?
Date: Thu, 11 Oct 2001 16:37:37 -0700

Eliot,

I'm having trouble getting a clear picture of your indexing scheme.
Could you provide some simple examples? E.g., for the XML:

  <tag1>this is some text</tag1>
  <tag2>and some other text</tag2>

would you have something like the following?

  doc1
    node_type: tag1
    contents: this is some text
  doc2
    node_type: tag2
    contents: and some other text
  doc3
    node_type: all_contents
    contents: this is some text and some other text

That would help me.

My first instinct would be to have something like:

  doc1
    tag1: this is some text
    tag2: and some other text
    all-tags: this is some text and some other text

What do you need that this does not achieve?

Doug
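
To make the two layouts in the message concrete, here is a minimal sketch that builds both from the sample text. It models documents as plain Python dicts and does not use the Lucene API. The wrapping <root> element and the function names per_node_documents and single_document are assumptions made for illustration; the message does not show the enclosing XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input: the <root> wrapper is an assumption, added
# only so that the fragment parses as a well-formed XML document.
XML = "<root><tag1>this is some text</tag1><tag2>and some other text</tag2></root>"


def per_node_documents(xml_text):
    """First layout: one document per element, plus one holding all the text."""
    root = ET.fromstring(xml_text)
    docs = [
        {"node_type": child.tag, "contents": (child.text or "").strip()}
        for child in root
    ]
    combined = " ".join(d["contents"] for d in docs)
    docs.append({"node_type": "all_contents", "contents": combined})
    return docs


def single_document(xml_text):
    """Second layout: one document with a field per tag and a catch-all field.

    Note: a repeated tag name would overwrite the earlier value in this sketch.
    """
    root = ET.fromstring(xml_text)
    doc = {child.tag: (child.text or "").strip() for child in root}
    # The right-hand side is evaluated before "all-tags" is added to the dict.
    doc["all-tags"] = " ".join(doc.values())
    return doc


if __name__ == "__main__":
    for d in per_node_documents(XML):
        print(d)
    print(single_document(XML))
```

In the first layout a query has to name the node type alongside the text it is looking for, while in the second the tag name is the field name, so a search restricted to one tag is a search on one field.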