From: Ulrich Mayring
To: lucene-user@jakarta.apache.org
Subject: Re: commercial websites powered by Lucene?
Date: Tue, 24 Jun 2003 15:36:25 +0200

Chris Miller wrote:
>
> Fair enough, I haven't tried much in the way of profiling yet. I just
> thought you might have found some Lucene settings that made a big difference
> for you, or you'd found indexing into a RAMDirectory then dumping it to disk
> was faster, etc. But it sounds like you're pretty happy with near default
> settings.

Yes, definitely.

> Our current DB server (running SQL Server) is under enormous strain, partly
> due to the complex searches that are being performed against it. We've got
> it pretty heavily tweaked already, so I don't think there's too much room to
> improve on that front. The idea is to use Lucene to take the searching load
> off it so it can get on with all the other tasks it has to perform. The
> Lucene implementation I'm working on here is just a proof of concept - it
> may be that we stay with SQL Server in the long run anyway, but Lucene
> definitely seems to be worth investigating - it has certainly worked well
> for us on smaller projects.

Well, nothing against Lucene, but it doesn't solve your problem, which
is an overloaded DB server. It may temporarily alleviate the effects,
but you'll soon be at the same load again. So I'd recommend installing
additional databases (MySQL comes to mind) that contain duplicates of
your data, but in a form that is customized to your searches. Then run
the searches against these databases and use the SQL Server merely as a
storage backend and definitive data source.
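
To make the idea of a search copy "customized to your searches" concrete,
here is a minimal sketch. It is an illustration under stated assumptions,
not the setup discussed in this thread: SQLite (via Python's sqlite3
module) stands in for both the source database and the search database,
and every table, column and function name is hypothetical. The expensive
join runs once, when the copy is built, and searches then read a single
flat table.

```python
import sqlite3


def build_search_copy(source: sqlite3.Connection, search: sqlite3.Connection) -> int:
    """Run the join once on the source and store flat rows in the search database.

    Returns the number of rows copied. All table names are hypothetical.
    """
    rows = source.execute(
        """
        SELECT p.id, p.name, c.name, s.name
        FROM product p
        JOIN category c ON c.id = p.category_id
        JOIN supplier s ON s.id = p.supplier_id
        """
    ).fetchall()

    # Rebuild the flat table from scratch; dropping it also drops its index.
    search.execute("DROP TABLE IF EXISTS product_search")
    search.execute(
        "CREATE TABLE product_search ("
        "product_id INTEGER PRIMARY KEY, product TEXT, category TEXT, supplier TEXT)"
    )
    search.executemany("INSERT INTO product_search VALUES (?, ?, ?, ?)", rows)
    # Index the column the searches filter on.
    search.execute("CREATE INDEX idx_search_category ON product_search (category)")
    search.commit()
    return len(rows)


def find_by_category(search: sqlite3.Connection, category: str) -> list:
    """Search the flat copy; no join is needed at query time."""
    cur = search.execute(
        "SELECT product, supplier FROM product_search "
        "WHERE category = ? ORDER BY product",
        (category,),
    )
    return cur.fetchall()


def make_demo_source() -> sqlite3.Connection:
    """Create a tiny normalized source database with made-up demo data."""
    source = sqlite3.connect(":memory:")
    source.executescript(
        """
        CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE supplier (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE product (
            id INTEGER PRIMARY KEY, name TEXT,
            category_id INTEGER, supplier_id INTEGER
        );
        INSERT INTO category VALUES (1, 'books'), (2, 'music');
        INSERT INTO supplier VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO product VALUES
            (1, 'Atlas', 1, 1), (2, 'Novel', 1, 2), (3, 'Album', 2, 1);
        """
    )
    return source


if __name__ == "__main__":
    source = make_demo_source()
    search = sqlite3.connect(":memory:")
    build_search_copy(source, search)
    print(find_by_category(search, "books"))
```

In a real deployment the copy would have to be refreshed whenever the
source data changes; how often, and whether incrementally or by full
rebuild, depends on how stale the search results are allowed to be.
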

What makes database searches complex is usually joins. It is therefore
a good idea to join only once (i.e. at data creation time) and then copy
the aggregated data in a flat form into a search database. That is
basically what you are doing with Lucene right now, but Lucene is a
full-text indexer; it is geared towards unstructured data. If your data
is already in a database in a structured form, it doesn't make much
sense IMHO to use Lucene.

Of course, in real life there may be political obstacles that prevent
you from doing the right thing (as detailed above, for example), and
your only option is to work around them in some way - and then Lucene is
a great way to do that. But keep in mind that you are basically
reinventing functionality that is already built into a database :)

Ulrich