lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From crack...@comcast.net
Subject Re: Indexing Complex XML
Date Sun, 19 Apr 2009 19:05:53 GMT
try vtd-xml 
http://vtd-xml.sf.net 
it works with any XML regardless of complexity 
----- Original Message ----- 
From: "Digy" <digydigy@gmail.com> 
To: java-user@lucene.apache.org 
Sent: Saturday, April 18, 2009 12:25:21 PM GMT -08:00 US/Canada Pacific 
Subject: RE: Indexing Complex XML 

doc.add(new Field("authors", "name1 surname1 name2 surmane2", StoreOption, 
IndexOption); 

So you can make a search like 
authors:"name1 surname1" 

(Disadvantage: you will also get result with a search like authors:"surname1 
name2" ) 
DIGY 

-----Original Message----- 
From: Daniel Susanto [mailto:daniel_sus777@yahoo.com] 
Sent: Saturday, April 18, 2009 9:09 PM 
To: java-user@lucene.apache.org 
Subject: Re: Indexing Complex XML 

Thanks Erick, 

In more complex xml I mean, for example this xml: 

<root> 
<book> 
<title>Lucene Book</title> 
<authors> 
<author>Book author 1</author> 
<author>Book author 2</author> 
</authors> 
<summary>Book for Lucene</summary> 
</book> 
<book> 

<title>Lucene Book 2</title> 

<authors> 

<author>Book 2 author 1</author> 

<author>Book 2 author 2</author> 


</authors> 

<summary>Book 2 for Lucene</summary> 


</book> 
</root> 

for each 'book' node is handled by one Document rite? and now 
how to handle the 'authors' node? should I put in new Document? or how? 

thx. :) 
Daniel 
Daniel Susanto 
http://susantodaniel.wordpress.com 

--- On Sun, 4/19/09, Erick Erickson <erickerickson@gmail.com> wrote: 

From: Erick Erickson <erickerickson@gmail.com> 
Subject: Re: Indexing Complex XML 
To: java-user@lucene.apache.org 
Date: Sunday, April 19, 2009, 12:01 AM 

Lucene is an *engine*, not an application. *You* have to process the 
XML, decide what the structure of your index is and index the data. There 
are many 
XML parser options, this is just straight Java code. You'll decide 
what's relevant, and add the contents of the relevant elements to a Lucene 
document 
then add that to your index. 

Similarly for searching. 

So, say you have the following simple XML doc 
<root> 
<ele1>ele 1 text</ele1> 
<ele2>ele 2 text</ele2> 
</root> 

You'd have to parse that text, then, say, add (semi-pseudo-code) 
Document doc = new Document() 
doc.add(new Field("ele1field", "ele 1 text", StoreOPtion, IndexOption); 
doc.add(new Field("ele2field", "ele 2 text", StoreOption, IndexOption); 
writer.add(doc); 

Then at search time you'd form your queries on "ele1field" and ele2field". 

HTH 
Erick 

On Sat, Apr 18, 2009 at 11:19 AM, daniel susanto 
<daniel_sus777@yahoo.com>wrote: 

> Hi, 
> 
> I need advise or example to index complex XML file, I mean the XML note 
> just in one level node but more than one. for example indexing rss or 
atom. 
> 
> thx b4. 
> Daniel Susanto 
> http://susantodaniel.wordpress.com 
> 
> 
> 






--------------------------------------------------------------------- 
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org 
For additional commands, e-mail: java-user-help@lucene.apache.org 


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message