lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ulrich Mayring <>
Subject Re: commercial websites powered by Lucene?
Date Tue, 24 Jun 2003 13:36:25 GMT
Chris Miller wrote:
> Fair enough, I haven't tried much in the way of profiling yet. I just
> thought you might have found some Lucene settings that made a big difference
> for you, or you'd found indexing into a RAMDirectory then dumping it to disk
> was faster, etc. But it sounds like you're pretty happy with near default
> settings.

Yes, definitely.

> Our current DB server (running SQL Server) is under enormous strain, partly
> due to the complex searches that are being performed against it. We've got
> it pretty heavily tweaked already, so I don't think there's too much room to
> improve on that front. The idea is to use Lucene to take the searching load
> off it so it can get on with all the other tasks it has to perform. The
> Lucene implementation I'm working on here is just a proof of concept - it
> may be that we stay with SQL Server in the long run anyway, but Lucene
> definitely seems to be worth investigating - it has certainly worked well
> for us on smaller projects.

Well, nothing against Lucene, but it doesn't solve your problem, which 
is an overloaded DB-Server. It may temporarily alleviate the effects, 
but you'll soon be at the same load again. So I'd recommend to install 
additional databases (MySQL comes to mind), which contain duplicates of 
your data, but in a form that is customized to your searches. Then do 
the searches on these databases and use the SQL Server merely as a 
storage backend and definitive data source.

What makes searches complex in databases are usually joins. It is 
therefore a good idea to join only once (i.e. at data creation time) and 
then copy the aggregated data in a flat form into a search database. 
That is basically what you are doing with Lucene right now, but Lucene 
is a full-text indexer, it is geared towards unstructured data. If your 
data is already in a database in a structured form, it doesn't make much 
sense IMHO to use Lucene.

Of course, in real life there may be political obstacles which will 
prevent you from doing the right thing as detailed above for example, 
and your only chance is to circumvent in some way - and then Lucene is a 
great way to do that. But keep in mind that you are basically 
reinventing the functionality that is already built-in in a database :)


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message