From: "Grant Ingersoll"
Date: Fri, 05 Dec 2003 09:48:36 -0500
List-Id: "Lucene Users List"
Subject: Index and Field.Text

Hi,

I have seen the example SAX-based XML processing in the Lucene sandbox (thanks to the authors for contributing!) and have successfully adapted this approach for my application. The one thing that does not sit well with me is that I am using the method Field.Text(String, String) instead of the Field.Text(String, Reader) version, which means I am storing the contents in the index.

Some questions:

1. Should I care? What is the cost of storing the contents of these files versus using the Reader-based method?
Presumably, the index size is going to be larger, but will it adversely affect search time? If so, by how much (relatively speaking)?

2. If storing the content is going to adversely affect searching, has anyone written an XMLReader that extends java.io.Reader? I guess it would need to take in the name of the tag(s) that you want the reader to retrieve and then override the java.io.Reader read methods to return values based on just the tag values that I am interested in. Has anyone taken this approach? If not, does it at least seem like a valid approach?

Thanks for your help!

-Grant Ingersoll
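
The Reader idea in question 2 could be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing Lucene or sandbox class: the name TagTextReader is hypothetical, and it relies on the javax.xml.stream (StAX) pull parser, which ships with later JDKs than the one current when this message was written. It pulls XML events lazily and hands out only the text found inside the requested tags, so nothing is buffered beyond one text chunk at a time.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Set;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical Reader that streams only the text inside selected XML tags. */
class TagTextReader extends Reader {
    private final XMLStreamReader xml;  // pull parser over the underlying XML
    private final Set<String> tags;     // local names of the tags to keep
    private int depth = 0;              // > 0 while inside a wanted tag
    private String pending = "";        // text parsed but not yet handed out
    private int pos = 0;                // read position within pending

    TagTextReader(Reader source, Set<String> tags) throws XMLStreamException {
        this.xml = XMLInputFactory.newInstance().createXMLStreamReader(source);
        this.tags = tags;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        try {
            // Advance the parser until some wanted text is available.
            while (pos >= pending.length()) {
                if (!xml.hasNext()) {
                    return -1; // end of document
                }
                int event = xml.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    // Count nesting so child elements of a wanted tag are kept too.
                    if (depth > 0 || tags.contains(xml.getLocalName())) {
                        depth++;
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    if (depth > 0 && --depth == 0) {
                        // Emit a space so values of adjacent tags do not run together.
                        pending = " ";
                        pos = 0;
                    }
                } else if ((event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA) && depth > 0) {
                    pending = xml.getText();
                    pos = 0;
                }
            }
        } catch (XMLStreamException e) {
            throw new IOException(e);
        }
        // Copy as much of the pending text as fits into the caller's buffer.
        int n = Math.min(len, pending.length() - pos);
        pending.getChars(pos, pos + n, buf, off);
        pos += n;
        return n;
    }

    @Override
    public void close() throws IOException {
        try {
            xml.close();
        } catch (XMLStreamException e) {
            throw new IOException(e);
        }
    }
}
```

An instance built over the file's Reader, with the tag names of interest, could then presumably be handed to Field.Text(String, Reader) in place of the extracted String. Whether that is worth the trouble depends on the answer to question 1.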