From: "Digy" <digydigy@gmail.com>
To: <java-user@lucene.apache.org>
Subject: RE: Indexing Complex XML
Date: Sat, 18 Apr 2009 22:25:21 +0300

doc.add(new Field("authors", "name1 surname1 name2 surname2", StoreOption, IndexOption));

So you can run a search like
	authors:"name1 surname1"

(Disadvantage: you will also get results for a search like
authors:"surname1 name2", because the phrase spans the boundary between two
authors.)
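
To make the single-field idea concrete for the book XML quoted further down in this thread, here is a minimal sketch that uses only the JDK's DOM parser. The class and method names (BookAuthors, authorsPerBook) are made up for illustration and are not part of Lucene. Each returned string is what you would pass as the value of the "authors" field when building the Lucene Document for that book.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not a Lucene class): one joined "authors" value per <book>.
class BookAuthors {

    static List<String> authorsPerBook(String xml) {
        Document dom;
        try {
            dom = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }

        List<String> result = new ArrayList<>();
        NodeList books = dom.getElementsByTagName("book");
        for (int i = 0; i < books.getLength(); i++) {
            Element book = (Element) books.item(i);
            // Collect the text of every <author> below this <book>.
            NodeList authors = book.getElementsByTagName("author");
            List<String> names = new ArrayList<>();
            for (int j = 0; j < authors.getLength(); j++) {
                names.add(authors.item(j).getTextContent().trim());
            }
            // This joined string would become the value of the "authors" field
            // in the one Lucene Document created for this book.
            result.add(String.join(" ", names));
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<root><book><authors>"
                + "<author>Book author 1</author><author>Book author 2</author>"
                + "</authors></book></root>";
        System.out.println(authorsPerBook(xml));
    }
}
```

For the first book this gives "Book author 1 Book author 2", which also shows the disadvantage: the words "1 Book" sit next to each other even though they belong to two different authors.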
DIGY

-----Original Message-----
From: Daniel Susanto [mailto:daniel_sus777@yahoo.com]
Sent: Saturday, April 18, 2009 9:09 PM
To: java-user@lucene.apache.org
Subject: Re: Indexing Complex XML

Thanks Erick,

By more complex XML I mean, for example, this:

<root>
  <book>
    <title>Lucene Book</title>
    <authors>
      <author>Book author 1</author>
      <author>Book author 2</author>
    </authors>
    <summary>Book for Lucene</summary>
  </book>
  <book>
    <title>Lucene Book 2</title>
    <authors>
      <author>Book 2 author 1</author>
      <author>Book 2 author 2</author>
    </authors>
    <summary>Book 2 for Lucene</summary>
  </book>
</root>

Each 'book' node is handled by one Document, right? Then how should I
handle the 'authors' node? Should I put it in a new Document, or something else?

thx. :)
Daniel
Daniel Susanto
http://susantodaniel.wordpress.com

--- On Sun, 4/19/09, Erick Erickson <erickerickson@gmail.com> wrote:

From: Erick Erickson <erickerickson@gmail.com>
Subject: Re: Indexing Complex XML
To: java-user@lucene.apache.org
Date: Sunday, April 19, 2009, 12:01 AM

Lucene is an *engine*, not an application. *You* have to process the
XML, decide what the structure of your index is, and index the data. There
are many XML parser options; this is just straight Java code. You decide
what's relevant, add the contents of the relevant elements to a Lucene
document, then add that document to your index.

Similarly for searching.

So, say you have the following simple XML doc:
<root>
   <ele1>ele 1 text</ele1>
   <ele2>ele 2 text</ele2>
</root>

You'd have to parse that text, then, say, add (semi-pseudo-code):
Document doc = new Document();
// StoreOption and IndexOption stand for the Field.Store and Field.Index settings you choose
doc.add(new Field("ele1field", "ele 1 text", StoreOption, IndexOption));
doc.add(new Field("ele2field", "ele 2 text", StoreOption, IndexOption));
writer.addDocument(doc);

Then at search time you'd form your queries on "ele1field" and "ele2field".
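
The parsing half of that can be sketched with the JDK's built-in DOM parser. This is only one of the many parser options, and the names here (SimpleXmlFields, toFields, the "field" suffix rule) are assumptions made for the example, not anything Lucene prescribes. The sketch stops at a map of field name to text; each entry is what you would turn into a Field on the Lucene document.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper (not a Lucene class): maps each child element of the
// root to a field name and its text content.
class SimpleXmlFields {

    static Map<String, String> toFields(String xml) {
        Document dom;
        try {
            dom = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }

        Map<String, String> fields = new LinkedHashMap<>();
        NodeList children = dom.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            // Skip whitespace text nodes between elements.
            if (node instanceof Element) {
                Element element = (Element) node;
                // Assumed naming rule: <ele1> becomes field "ele1field".
                fields.put(element.getTagName() + "field",
                           element.getTextContent().trim());
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String xml = "<root>\n   <ele1>ele 1 text</ele1>\n   <ele2>ele 2 text</ele2>\n</root>";
        // Each entry would become one Field added to the Lucene document.
        toFields(xml).forEach((name, text) -> System.out.println(name + " = " + text));
    }
}
```

Which elements you keep, and what you call the fields, is exactly the "decide what's relevant" part, so treat the naming rule above as a placeholder for your own mapping.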

HTH
Erick

On Sat, Apr 18, 2009 at 11:19 AM, daniel susanto <daniel_sus777@yahoo.com> wrote:

> Hi,
>
> I need advice or an example for indexing a complex XML file. I mean XML
> with more than one level of nodes, not just one, for example RSS or Atom.
>
> thx b4.
> Daniel Susanto
> http://susantodaniel.wordpress.com
>
>
>

