lucene-java-user mailing list archives

From "Chris Miller" <>
Subject Re: commercial websites powered by Lucene?
Date Tue, 24 Jun 2003 16:15:49 GMT
Thanks for the pointers, and rest assured we are looking into such
approaches. However, the data that we have is coming in from a wide variety
of customers and is unfortunately not nearly as structured as we would like
(and we are powerless to change that). So while we do have some fields that
are database-friendly, a large portion of what we have to search against is
plain text, which is why I'm looking into Lucene as a possible solution. To
be honest, I'm still a bit torn between using a database and Lucene for
searching since the data we have falls into the grey area between the two.
Once we have a decent proof of concept of each approach up and running, I
guess a clearer picture will emerge.

I'm not clear on why you think we'll soon be back up to the same load on the
DB server. What is going to increase the load? Our volume of data is not
increasing; all that will change is that the DB will no longer get hit for
searches. We'll still be pulling content etc. from the database at roughly
the same rate, but that doesn't appear to be a source of any problems.
Whether we offload the searching to MySQL DBs or Lucene makes no difference
as far as I can see.

> Well, nothing against Lucene, but it doesn't solve your problem, which
> is an overloaded DB-Server. It may temporarily alleviate the effects,
> but you'll soon be at the same load again. So I'd recommend to install
> additional databases (MySQL comes to mind), which contain duplicates of
> your data, but in a form that is customized to your searches. Then do
> the searches on these databases and use the SQL Server merely as a
> storage backend and definitive data source.
> What makes searches complex in databases are usually joins. It is
> therefore a good idea to join only once (i.e. at data creation time) and
> then copy the aggregated data in a flat form into a search database.
> That is basically what you are doing with Lucene right now, but Lucene
> is a full-text indexer, it is geared towards unstructured data. If your
> data is already in a database in a structured form, it doesn't make much
> sense IMHO to use Lucene.
> Of course, in real life there may be political obstacles which will
> prevent you from doing the right thing as detailed above for example,
> and your only chance is to circumvent in some way - and then Lucene is a
> great way to do that. But keep in mind that you are basically
> reinventing the functionality that is already built-in in a database :)
> Ulrich
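
As a rough illustration of the "join once, then copy flat" idea quoted above,
here is a minimal sketch in plain Java. It is not code from this thread: the
class and field names (FlatSearchRows, Customer, Item, and the "customer",
"title" and "text" keys) are hypothetical, and two in-memory lists stand in
for the joined tables. It only shows the join being done once at write time so
that each search record is a single flat row.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FlatSearchRows {

    /** Hypothetical normalized row, standing in for a customer table. */
    static final class Customer {
        final int id;
        final String name;
        Customer(int id, String name) { this.id = id; this.name = name; }
    }

    /** Hypothetical normalized row, standing in for an item table. */
    static final class Item {
        final int customerId;
        final String title;
        final String description;
        Item(int customerId, String title, String description) {
            this.customerId = customerId;
            this.title = title;
            this.description = description;
        }
    }

    /**
     * Performs the join once, at data creation time, and returns one flat
     * row per item. Items without a matching customer are skipped, which
     * mirrors inner-join behaviour.
     */
    static List<Map<String, String>> flatten(List<Customer> customers, List<Item> items) {
        Map<Integer, String> nameById = new LinkedHashMap<>();
        for (Customer c : customers) {
            nameById.put(c.id, c.name);
        }
        List<Map<String, String>> rows = new ArrayList<>();
        for (Item item : items) {
            String customerName = nameById.get(item.customerId);
            if (customerName == null) {
                continue; // no matching customer: leave it out of the search copy
            }
            Map<String, String> row = new LinkedHashMap<>();
            row.put("customer", customerName);   // structured field
            row.put("title", item.title);        // structured field
            row.put("text", item.description);   // free text to search against
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Customer> customers = new ArrayList<>();
        customers.add(new Customer(1, "Acme"));
        List<Item> items = new ArrayList<>();
        items.add(new Item(1, "Widget", "A small blue widget"));
        for (Map<String, String> row : flatten(customers, items)) {
            System.out.println(row);
        }
    }
}
```

Each flat row could then be written to a separate search database or handed
to a full-text indexer; either way, the search side never has to join.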
