I mean more complex XML, for example this XML:
Book author 1
Book author 2
Book for Lucene
Lucene Book 2
Book 2 author 1
Book 2 author 2
Book 2 for Lucene
Each 'book' node is handled by one Document, right? And now
how do I handle the 'authors' node? Should I put it in a new Document, or something else?
--- On Sun, 4/19/09, Erick Erickson wrote:
From: Erick Erickson
Subject: Re: Indexing Complex XML
Date: Sunday, April 19, 2009, 12:01 AM
Lucene is an *engine*, not an application. *You* have to process the
XML, decide what the structure of your index is, and index the data. There
are XML parser options; this is just straight Java code. You'll decide
what's relevant, add the contents of the relevant elements to a Lucene
Document, then add that to your index.
Similarly for searching.
So, say you have the following simple XML doc
ele 1 text
ele 2 text
You'd have to parse that text, then, say, add (semi-pseudo-code)
Document doc = new Document();
doc.add(new Field("ele1field", "ele 1 text", StoreOption, IndexOption));
doc.add(new Field("ele2field", "ele 2 text", StoreOption, IndexOption));
Then at search time you'd form your queries on "ele1field" and "ele2field".
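
To make the parsing step concrete, here is a rough sketch using only the JDK's built-in DOM parser. The element names (doc, ele1, ele2) are assumptions, since the tags of the sample XML did not survive in this message, and the class and method names (XmlToFields, extractFields) are made up for illustration. A plain Map stands in for the Lucene Document so the sketch runs without Lucene on the classpath; in real code each map entry would become one doc.add(new Field(...)) call.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XmlToFields {

    // Parses the XML and returns one (field name -> text) entry per child
    // element of the root. The "<tag>field" naming mirrors "ele1field" above.
    static Map<String, String> extractFields(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        org.w3c.dom.Document dom = builder.parse(new InputSource(new StringReader(xml)));

        Map<String, String> fields = new LinkedHashMap<>();
        NodeList children = dom.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            // Skip whitespace and comment nodes; only elements become fields.
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                Element element = (Element) node;
                fields.put(element.getTagName() + "field", element.getTextContent().trim());
            }
        }
        return fields;
    }

    public static void main(String[] args) throws Exception {
        // Assumed shape of the simple XML doc; the real tag names are unknown.
        String xml = "<doc><ele1>ele 1 text</ele1><ele2>ele 2 text</ele2></doc>";

        // Each entry corresponds to one doc.add(new Field(name, value, ...)) call.
        for (Map.Entry<String, String> field : extractFields(xml).entrySet()) {
            System.out.println(field.getKey() + " = " + field.getValue());
        }
    }
}
```

Note that a Map keeps only one value per name. For repeated elements you would probably want a list of values per name instead, since a Lucene Document accepts several fields with the same name.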
On Sat, Apr 18, 2009 at 11:19 AM, daniel susanto wrote:
> I need advice or an example for indexing a complex XML file. I mean XML with not
> just one level of nodes but more than one, for example indexing RSS or Atom.
> Thanks in advance.
> Daniel Susanto