Message-ID: <20011129164531.68123.qmail@web20007.mail.yahoo.com>
Date: Thu, 29 Nov 2001 08:45:31 -0800 (PST)
From: Yiyi Sun
Subject: Re: parsing XML
To: Lucene Developers List
List-Id: "Lucene Developers List"

Hi,

Thanks a lot. I would like to have your XML package and demo.

Cheers!
Yiyi

--- "Ogren, Philip V." wrote:
> I didn't pore through the archive to make sure no one had done this yet,
> but...
> I have a generic way of indexing XML that I think is really useful.
> Basically, I implement the DefaultHandler (in SAX) that handles XML
> documents that look something like this:
>
> <field ... token="true">a small field</field>
> <field ... token="true">a large field</field>
>
> I haven't actually written a DTD or schema because I haven't needed one
> yet.* I create an org.apache.lucene.document.Field for each 'field' tag that
> is processed. The way I get an XML document that conforms to this very
> simplistic schema is through XSLT.
> You simply create a style sheet that
> transforms your specific xml document into xml that conforms with the above
> tags. It's proven very useful on our project because changing the way an
> xml document is indexed requires no change in the code - I simply change my
> style sheet and reindex.
>
> I would be willing to cut a version of this code that would be suitable for
> a demonstration - along with a demo - if there is any interest.
>
> Regards,
> Philip Ogren
>
> *I originally had a 'datefield' tag as well but I found the DateField class
> to be useless for my application as it doesn't handle dates before 1970.
>
> Philip V. Ogren
> Medical Information Resources
> Mayo Clinic Rochester
> (507) 538-0167
> ogren@mayo.edu
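
For readers of the archive, the handler idea described in the quoted message might look roughly like the sketch below. It is not Philip's code, which was not posted in this thread. The class names FieldXmlSketch, FieldHandler and IndexField, the 'document' root element, and the 'name' attribute are all assumptions made for illustration; only the 'field' tag and the token attribute come from the message. IndexField is a plain stand-in for org.apache.lucene.document.Field so that the example runs without Lucene on the classpath.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class FieldXmlSketch {

    /** Hypothetical stand-in for org.apache.lucene.document.Field. */
    static final class IndexField {
        final String name;
        final boolean tokenized;
        final String text;

        IndexField(String name, boolean tokenized, String text) {
            this.name = name;
            this.tokenized = tokenized;
            this.text = text;
        }
    }

    /** Collects one IndexField per 'field' element seen by the SAX parser. */
    static final class FieldHandler extends DefaultHandler {
        final List<IndexField> fields = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();
        private String name;        // assumed 'name' attribute, not in the original message
        private boolean tokenized;  // the token="true" attribute from the message
        private boolean inField;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("field".equals(qName)) {
                inField = true;
                name = atts.getValue("name");
                tokenized = "true".equals(atts.getValue("token"));
                text.setLength(0);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // SAX may deliver element text in several chunks, so accumulate it.
            if (inField) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("field".equals(qName)) {
                fields.add(new IndexField(name, tokenized, text.toString().trim()));
                inField = false;
            }
        }
    }

    /** Parses field-style XML and returns the collected fields. */
    static List<IndexField> parse(String xml) throws Exception {
        FieldHandler handler = new FieldHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.fields;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<document>"
                + "<field name=\"title\" token=\"true\">a small field</field>"
                + "<field name=\"id\" token=\"false\">doc-42</field>"
                + "</document>";
        for (IndexField f : parse(xml)) {
            System.out.println(f.name + " tokenized=" + f.tokenized + " text=" + f.text);
        }
    }
}
```

In a real indexer, the endElement step would presumably build a Lucene Field and add it to a Document instead of filling a list. The XSLT stage described in the message would sit in front of this handler, turning each project-specific document into the simple field format before it is parsed.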