lucene-java-user mailing list archives

From ambiese...@gmx.de
Subject XML support in Lucene
Date Tue, 25 Nov 2003 16:07:24 GMT
Hello group,

does Lucene offer an effective and flexible way to handle XML files? I know
that as soon as an InputStream is provided, Lucene can index basically anything
(possibly after some cleaning). How does this work for XML files?

If there is a way, is it possible to have one big XML file with many
individual parts in it? Each part should be treated as a document and the
repeating XML tags as fields.

Here is an example:

<MySMSList>
  <SMS>
    <From>Tim</From>
    <Content>How are you? Tom</Content>
  </SMS>
  <SMS>
    <From>Linda</From>
    <Content>bla bla bla</Content>
  </SMS>
</MySMSList>


Has somebody already developed classes that go through this XML file,
create TWO documents with the fields "From" and "Content", and fill in the text
between the tags? The indexing should then work as usual, since it is
abstracted behind the Document object. The same goes for the search process.
Searching, however, could be optimised with structural information (i.e.
only search in "Content")...
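
To make the idea concrete, here is a minimal sketch of the splitting step using only the JDK's DOM parser. The class name SmsXmlSplitter and the method split are hypothetical names chosen for illustration. It assumes the XML is well-formed (every <SMS> properly closed) and that each child tag occurs at most once per <SMS>; a repeated tag would overwrite the earlier value. Turning each resulting map into a Lucene Document (one field per map entry) is deliberately left out, since the exact field-creation calls depend on the Lucene version.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper: splits one big XML file into per-<SMS> field maps.
final class SmsXmlSplitter {

    /** Returns one map per <SMS> element; keys are the child tag names. */
    static List<Map<String, String>> split(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        org.w3c.dom.Document dom = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        List<Map<String, String>> records = new ArrayList<>();
        NodeList smsNodes = dom.getElementsByTagName("SMS");
        for (int i = 0; i < smsNodes.getLength(); i++) {
            // LinkedHashMap keeps the fields in document order.
            Map<String, String> fields = new LinkedHashMap<>();
            NodeList children = smsNodes.item(i).getChildNodes();
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                // Skip whitespace text nodes between the tags.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    fields.put(child.getNodeName(), child.getTextContent().trim());
                }
            }
            // Each map would become one Lucene Document, one field per entry.
            records.add(fields);
        }
        return records;
    }

    public static void main(String[] args) throws Exception {
        String xml =
                "<MySMSList>"
              + "  <SMS><From>Tim</From><Content>How are you? Tom</Content></SMS>"
              + "  <SMS><From>Linda</From><Content>bla bla bla</Content></SMS>"
              + "</MySMSList>";
        for (Map<String, String> record : split(xml)) {
            System.out.println(record);
        }
        // prints:
        // {From=Tim, Content=How are you? Tom}
        // {From=Linda, Content=bla bla bla}
    }
}
```

A DOM parser loads the whole file into memory, which is fine for a sketch; for a very large file a streaming parser (SAX or similar) that emits one record per closing </SMS> would probably be the better fit.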

Cheers,
Ralph




