lucene-java-user mailing list archives

From "Yonik Seeley" <ysee...@gmail.com>
Subject Re: Nutch? Solr? Red-piranha?
Date Tue, 28 Mar 2006 17:48:56 GMT
On 3/28/06, Michael Levy <michael@michaelrlevy.com> wrote:
> I'm looking for advice on selecting a search application.  I'm
> responsible for developing a new search platform for use in a historical
> research organization and museum.  I've pretty much decided on Lucene as
> the library for custom servlet apps that would use the Lucene API directly.
>
> At the same time, we have a number of application ideas that could
> probably use a flexible/customizable off-the-shelf crawling
> application.  For starters, it would be pretty basic stuff like indexing
> PDF files using some library, returning links that have been translated
> to point to a Tomcat virtual directory containing the files.  But our
> apps could quickly get more complex as we think of new search ideas.
>
> I'm having a hard time comparing and contrasting Nutch, Solr, and
> Red-piranha.  I would appreciate anyone offering your ideas or
> experiences about which of these (or any other comprehensive search
> solutions) are good for which types of applications.  TIA!

I'd use Solr for highly structured data (documents with multiple
fields) where you want highly configurable text analysis per field.
Think of it as something like a database, but designed for full-text
search. We are working on adding easy faceted browsing and indexing of
SQL databases.

I'd use Nutch for web search (a free Google replacement): crawling
(discovering) and indexing web pages, and automatically handling
different types of human-readable documents like HTML, PDF, etc.

I don't have any experience with Red-piranha.

Example use cases as I see them:
To index a music web site with lots of articles: Nutch
To index a music collection (structured data like title, album,
author, year, genre, etc): Solr
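
To make the structured-data case a bit more concrete, here is a rough
sketch in plain Python of what "multiple fields with per-field text
analysis" means. This is not Solr or Lucene code: the schema, the
analyzer functions, the FieldIndex class, and the sample records are
all made up for illustration, and a real setup would be configured
quite differently.

```python
from collections import defaultdict


def analyze_text(value):
    """Full-text analysis: lowercase, then split on whitespace."""
    return str(value).lower().split()


def analyze_exact(value):
    """Exact-match analysis: the whole value becomes one lowercased token."""
    return [str(value).lower()]


# Hypothetical schema: each field name maps to its own analyzer.
SCHEMA = {
    "title": analyze_text,
    "album": analyze_text,
    "author": analyze_exact,
    "year": analyze_exact,
    "genre": analyze_exact,
}


class FieldIndex:
    """Toy in-memory inverted index keyed by (field, token)."""

    def __init__(self, schema):
        self.schema = schema
        self.postings = defaultdict(set)  # (field, token) -> set of doc ids

    def add(self, doc_id, doc):
        # Run each field's value through the analyzer configured for it.
        for field, value in doc.items():
            for token in self.schema[field](value):
                self.postings[(field, token)].add(doc_id)

    def search(self, field, query):
        """Return ids of documents whose field matches every query token."""
        tokens = self.schema[field](query)
        if not tokens:
            return set()
        matches = [self.postings.get((field, t), set()) for t in tokens]
        return set.intersection(*matches)


# Invented sample records, not real albums.
index = FieldIndex(SCHEMA)
index.add(1, {"title": "Blue Morning", "album": "First Light",
              "author": "Ann Example", "year": 2001, "genre": "Jazz"})
index.add(2, {"title": "Morning Train", "album": "Blue Lines",
              "author": "Bob Example", "year": 2003, "genre": "Rock"})

title_hits = index.search("title", "morning")        # word match in titles
author_hits = index.search("author", "Ann Example")  # whole-value match
partial_author_hits = index.search("author", "ann")  # no match: exact field
```

The point is only that the same query text behaves differently
depending on the field: "title" is tokenized into words, while
"author" is matched as a whole value.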

-Yonik
http://incubator.apache.org/solr Solr, The Open Source Lucene Search Server
