lucene-java-user mailing list archives

From "Brandon Jockman" <brand...@isogen.com>
Subject Re: Indexing multiple equiv fieldnames (revisited)
Date Wed, 29 May 2002 15:38:03 GMT
If you look on the contributions page, W. Eliot Kimber submitted a solution
months ago that does exactly what you are recommending. It is a good
example of how to index XML documents with Lucene while preserving
document structure. If you use that solution, you will want to limit the
elements that get indexed to those your application needs (i.e., only
index customer elements). This will keep performance manageable.
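
To make the "only index customer elements" idea concrete, here is a rough
sketch of the splitting step, using the customer XML from the original
message quoted below. It is not the code from the contributions page. It
uses only the JDK's built-in DOM parser, and the class and method names
(CustomerSplitter, splitCustomers, findAddress) are made up for this
example. Each map it returns stands in for what you would add to the index
as a separate Lucene Document; the Lucene calls themselves are left out.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: splits one XML file into one record per <customer>.
public class CustomerSplitter {

    // Sample data copied from the original message.
    static final String SAMPLE_XML =
        "<main>"
        + "<customer><name>Idrees</name><address>Deira, Dubai</address>"
        + "<country>UAE</country><phone>2976636</phone></customer>"
        + "<customer><name>X</name><address>Jalan Yap Kwan Seng</address>"
        + "<country>Malaysia</country><phone>0163773435</phone></customer>"
        + "<customer><name>Nader</name><address>Karama</address>"
        + "<country>UAE</country><phone>3911900</phone></customer>"
        + "</main>";

    // Returns one field map per <customer> element; every other element
    // in the file is ignored, which keeps the amount of indexed data small.
    static List<Map<String, String>> splitCustomers(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        org.w3c.dom.Document dom =
            builder.parse(new InputSource(new StringReader(xml)));
        NodeList customers = dom.getElementsByTagName("customer");

        List<Map<String, String>> records = new ArrayList<>();
        for (int i = 0; i < customers.getLength(); i++) {
            Map<String, String> fields = new LinkedHashMap<>();
            NodeList children = customers.item(i).getChildNodes();
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                // Skip whitespace text nodes; keep <name>, <address>, etc.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    fields.put(child.getNodeName(), child.getTextContent().trim());
                }
            }
            // In a real indexer, this map would become one Lucene Document.
            records.add(fields);
        }
        return records;
    }

    // Looks up the address of the record matching both name and country,
    // or returns null if no record matches.
    static String findAddress(List<Map<String, String>> records,
                              String name, String country) {
        for (Map<String, String> record : records) {
            if (name.equals(record.get("name"))
                    && country.equals(record.get("country"))) {
                return record.get("address");
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        List<Map<String, String>> records = splitCustomers(SAMPLE_XML);
        System.out.println(findAddress(records, "Idrees", "UAE"));
    }
}
```

Because each customer ends up in its own record, the name, country and
address values stay associated with each other, which is the logical
separation a flat, one-document-per-file index loses.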

I recommend checking out that XML indexing solution and searching the mail
archive for related emails.

Many XML indexing issues have already been addressed.

Regards,

-Brandon

Brandon Jockman
- ISOGEN International, LLC.
- brandonj@isogen.com

-=-


----- Original Message -----
From: "Mohamed Idrees" <idrees@planet-connections.net>
To: <lucene-user@jakarta.apache.org>
Sent: Tuesday, May 21, 2002 2:39 AM
Subject: Indexing multiple equiv fieldnames (revisited)


> Since the following discussion was not posted to the forum, I am posting
> it now.
>
>
>
> If you paste this method into the Document.java file in the Lucene
> source and recompile, the new jar will contain the method
> doc.getArray(name, maxCount).
> fieldList is the list of all the fields in the XML document.
>
> Your problem is that all the XML fields are in the same file, so as far
> as Lucene is concerned, you're searching the same document. What you
> need to do is either separate the customers over multiple files or write
> a little XML parsing method using Xerces or JDom that will read this
> customer XML file and then store each Customer in a separate Lucene
> Document, at which point you won't need the getArray function because
> you'll have the logical separation in the Lucene Index.
>
> PS: please post this conversation on the forum so other people can
> collaborate
>
>
>
> -----Original Message-----
> From: Mohamed Idrees [mailto:idrees@planet-connections.net]
> Sent: Tuesday, May 21, 2002 7:44 AM
> To: nsh@bayt.net
> Subject: Indexing multiple equiv fieldnames (revisited)
>
>
> hi Nader,
>
> Reg:
> http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg01580.html
>
> You have run into exactly the same problem I'm facing. In the following
> code of yours, what does fieldList refer to? (If possible, can you mail
> me the complete code that uses this method?)
>
>  /**
>   * Returns an array of the String values of every field called
>   * <i>name</i> in this document.
>   *
>   * @param name     the name of the field
>   * @param maxCount the number of elements you want to retrieve
>   *                 (also sets the initial size of the array)
>   */
>  public final String[] getArray(String name, int maxCount) {
>    int counter = 0;
>    String[] tempArray = new String[maxCount];
>    // walk the list of fields and collect up to maxCount matching values
>    for (DocumentFieldList list = fieldList; list != null; list = list.next) {
>      if (list.field.name().equals(name) && counter < maxCount
>          && list.field.stringValue() != null) {
>        tempArray[counter] = list.field.stringValue();
>        counter++;
>      }
>    }
>    // trim the output to the number of values actually found
>    String[] outputArray = new String[counter];
>    System.arraycopy(tempArray, 0, outputArray, 0, counter);
>    return outputArray;
>  }
>
> Moreover, this method seems to retrieve all the values of the same field
> from the whole document. What if I need to retrieve only one particular
> value, as in the following example:
>
> <main>
>   <customer>
>      <name>Idrees</name>
>      <address>Deira, Dubai</address>
>      <country>UAE</country>
>      <phone>2976636</phone>
>   </customer>
>   <customer>
>      <name>X</name>
>      <address>Jalan Yap Kwan Seng</address>
>      <country>Malaysia</country>
>      <phone>0163773435</phone>
>   </customer>
>   <customer>
>      <name>Nader</name>
>      <address>Karama</address>
>      <country>UAE</country>
>      <phone>3911900</phone>
>   </customer>
> </main>
>
> Suppose I want to retrieve "address" where "country=UAE and
> name=Idrees": the present Lucene Document returns "Karama" for me and
> not "Deira, Dubai".
>
> Thanking you in advance.
>
> Regards,
> Idrees
>
>


----------------------------------------------------------------------------
----


> --
> To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>



