lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "varun sood" <vso...@gmail.com>
Subject Re: Ignoring XML tags when creating an index with Lucene
Date Thu, 02 Mar 2006 18:02:26 GMT
Hi,
 You can use SAX2 / DOM parser to parse XML document before submiting it to
IndexWriter (Luncene) and select only those tags which you want to index.
Its fairly easy to implement XML parser.

Hope this helps.
Varun
**
On 3/2/06, Cian O'Maidin <CianOMaidin@celsussolutions.com> wrote:
>
> Hi ,
>
>             I am currently trying to Full-Text-Search Enable an
> application
> server that deals solely in one type of XML document.(Of my design)
>
> Currently I'm sending bitmap files embedded in an <IMAGE> tag in the XML,
> the bitmaps are encoded using Base64.
>
>
>
> I want to exclude these from the index that Lucene creates as they are
> seriously bulking up the size of the index.
>
>
>
> Can you suggest anything please?
>
>
>
> Cian
>
>
>
> Cian Ó Maidín,
>
> Managing Director,
>
> Celsus Solutions Ltd,
>
> Innovation and Research Centre,
>
> Carriganore,
>
> Waterford, Ireland.
>
>
>
> Phone: +353-86-3863812
>
> Fax:     +353-51-302901
>
> Email:  Cian@CelsusSolutions.com
>
>
>
>
>
> CELSUS SOLUTIONS
>
> Innovation for Mobile Professionals
>
> www.CelsusSolutions.com
>
>
>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message