forrest-user mailing list archives

From Johannes Schaefer
Subject lucene search: workaround for site.pdf/html
Date Mon, 30 Aug 2004 09:38:33 GMT

Lucene search doesn't work if site.xml contains entries
for site.pdf or site.html (the <all> section). As a
workaround, we moved these two entries into a separate file
(we call it "Printversion"):

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.2//EN"">
       ... <link href="site.html">Full HTML</link> ...
       ... <link href="site.pdf">Full PDF</link> ...

This works fine and gives us some room to explain what these
two links are for. Lucene doesn't follow the links in
the file, so it can build the index without problems.
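
For illustration, the site.xml side of this workaround might look
roughly like the fragment below. It is only a sketch: the element
names (about, index, printversion), the labels, the namespace URI and
the file name printversion.html are assumptions based on a typical
Forrest seed site, not copied from our actual site.xml. Only the
site.html and site.pdf entries are the ones discussed above.

<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical site.xml after the workaround -->
<site label="MyProject" href="" xmlns="http://apache.org/forrest/linkmap/1.0">
  <about label="About">
    <index label="Index" href="index.html"/>
    <!-- New entry pointing at the separate "Printversion" page -->
    <printversion label="Printversion" href="printversion.html"/>
  </about>
  <!-- The <all> section with the site.html and site.pdf entries
       is removed here; those links now live in the Printversion page. -->
</site>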

Just one question: which is better, putting "site:html" in the
file or "site:full_html"?


User Interface Design GmbH * Teinacher Str. 38 * D-71634 Ludwigsburg
Fon +49 (0)7141 377 000 * Fax  +49 (0)7141 377 00-99
Branch office: User Interface Design GmbH * Lehrer-Götz-Weg 11 * D-81825 München
