lucene-java-user mailing list archives

From David Trattnig <>
Subject Search environment: the best choice
Date Thu, 16 Feb 2006 12:20:51 GMT

I have the following setup (planned architecture):

[Webserver - APACHE]
which serves the content

[unspecified other servers]

[CMS Server / SearchEngine - TOMCAT]
handles content creation and publishing to the webserver,
and indexes the content stored on the Apache machine

The Tomcat machine should index the Apache machine, and maybe some other
servers, via a cron job. Search requests from the webserver are forwarded to
the search engine on the Tomcat machine.

Indexing HTML files has priority; PDF, Word and similar formats would be
very nice to have.

1.) Which search engine (meaning which Lucene-based implementation) would be
the best choice for such a situation? In other words: what's the difference
between the following?
          - Lucene
          - Nutch

2.) Are there other search engines which are better for solving this issue?
3.) Do I have to write my own indexer (one that parses HTML, PDF, ...) or are
there useful templates/indexers available?
4.) Does anybody know a free alternative (for commercial use) to Zilverline?

