Reply-To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Message-Id: <sfd0547f.033@gwia201.syr.edu>
Date: Fri, 05 Dec 2003 09:48:36 -0500
From: "Grant Ingersoll" <gsingers@syr.edu>
To: <lucene-user@jakarta.apache.org>
Subject: Index and Field.Text

Hi,

I have seen the example SAX-based XML processing in the Lucene sandbox
(thanks to the authors for contributing!) and have successfully adapted
this approach for my application.  The one thing that does not sit well
with me is that I am using the method Field.Text(String, String)
instead of the Field.Text(String, Reader) version, which means I am
storing the contents in the index.

Some questions:

1. Should I care?  What is the cost of storing the contents of these files
versus using the Reader-based method?  Presumably, the index size is going
to be larger, but will it adversely affect search time?  If yes, how much
so (relatively speaking)?

2. If storing the content is going to adversely affect searching, has
anyone written an XMLReader that extends java.io.Reader?  I guess it would
need to take in the name of the tag(s) that you want the reader to
retrieve and then override the java.io.Reader read methods to return
values based on just the tag values that I am interested in.  Has anyone
taken this approach?  If not, does it at least seem like a valid approach?
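
To make question 2 concrete, here is a rough sketch of the kind of Reader I
have in mind.  It is only an illustration under assumptions: the class name
TagTextReader is hypothetical, and it uses the JDK's StAX pull parser
(javax.xml.stream) rather than the SAX handler from the sandbox example,
because a pull parser lets read() fetch text on demand.  Nested elements
inside a wanted tag are included, and a single space is emitted after each
wanted element so neighbouring values do not run together.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical Reader that exposes only the text inside selected XML tags. */
class TagTextReader extends Reader {
    private final XMLStreamReader xml;
    private final Set<String> wanted;
    private int depth = 0;        // > 0 while the parser is inside a wanted element
    private String pending = "";  // text pulled from the parser, not yet handed out
    private int pos = 0;          // read position within 'pending'

    TagTextReader(Reader source, String... tagNames) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // no DTD processing
        this.xml = factory.createXMLStreamReader(source);
        this.wanted = new HashSet<>(Arrays.asList(tagNames));
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        try {
            // Pull from the parser until some wanted text is available.
            while (pos >= pending.length()) {
                if (!fill()) {
                    return -1; // end of document
                }
            }
        } catch (XMLStreamException e) {
            throw new IOException(e);
        }
        int n = Math.min(len, pending.length() - pos);
        pending.getChars(pos, pos + n, buf, off);
        pos += n;
        return n;
    }

    /** Advances to the next chunk of wanted text; returns false at end of input. */
    private boolean fill() throws XMLStreamException {
        while (xml.hasNext()) {
            int event = xml.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                if (depth > 0 || wanted.contains(xml.getLocalName())) {
                    depth++;
                }
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                if (depth > 0 && --depth == 0) {
                    pending = " "; // separator after each wanted element
                    pos = 0;
                    return true;
                }
            } else if (depth > 0 && (event == XMLStreamConstants.CHARACTERS
                    || event == XMLStreamConstants.CDATA)) {
                pending = xml.getText();
                pos = 0;
                return true;
            }
        }
        return false;
    }

    @Override
    public void close() throws IOException {
        try {
            xml.close();
        } catch (XMLStreamException e) {
            throw new IOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        String doc = "<doc><title>Hello</title><meta>skip</meta>"
                   + "<body>big <b>world</b></body></doc>";
        StringBuilder out = new StringBuilder();
        try (Reader r = new TagTextReader(new StringReader(doc), "title", "body")) {
            char[] buf = new char[8];
            int n;
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                out.append(buf, 0, n);
            }
        }
        System.out.println(out);
    }
}
```

An instance of such a class could then be handed to
Field.Text(String, Reader), so the tag text would be tokenized and indexed
without being stored.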

Thanks for your help!

-Grant Ingersoll

