From: "Shah, Vineel"
To: 'Lucene Developers List'
Subject: RE: custom scoring api questions
Date: Thu, 2 Jan 2003 11:19:23 -0500

It's not a bad thought. We're using Oracle and usually would just do a SQL query and let Oracle indices take care of the searching. However:

1. One of the fields is a CLOB with >16k of text per entry. We're using Oracle's Context, which has proven unreliable and slow.

2. We have to search on a normalized data structure. Each parent row may have 10-100 child rows. There may be up to 200,000 parent rows. There may be 60 query terms to look for in the child rows.
In my inherited codebase, the query does 60 joins against the child table for each parent. Needless to say, the web page times out before the search is done. Our users are understandably frustrated.

And so, it seems worthwhile to use a separate search engine and sync the database contents to it. I looked at Lucene because it is open source, written in Java, low overhead, and fast. So far, I'm extremely pleased with the results!

vineel

-----Original Message-----
From: Leo Galambos [mailto:galambos@com-os2.ms.mff.cuni.cz]
Sent: Tuesday, December 31, 2002 5:16 PM
To: Lucene Developers List
Subject: Re: custom scoring api questions

On Mon, 30 Dec 2002, Shah, Vineel wrote:

> I've been developing a search function with Lucene for a couple of weeks
> (it's wonderful!) I've run into a snag-- the way I need to calculate
> scores seems to have nothing to do with Lucene's scoring paradigm. I
> think this is because I'm doing a database-oriented search instead of a
> document-oriented one.

Isn't it better to use RDBMS with B+? I am not sure if a fulltext module
is a good way... Just a thought... :)

-g-
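
The parent/child search described in the message (many query terms matched against each parent's child rows) can be illustrated with a small inverted index. This is only a rough sketch in plain Python, not Lucene's API and not the poster's actual code: the sample data, the function names `build_index` and `search`, and the "count of matched terms" score are all assumptions made for illustration. It shows why a single pass over the query terms can replace one join per term.

```python
from collections import defaultdict


def build_index(parents):
    """Map each child value (term) to the set of parent ids that contain it.

    `parents` is a hypothetical dict: parent id -> list of child-row values.
    """
    index = defaultdict(set)
    for parent_id, child_values in parents.items():
        for value in child_values:
            index[value.lower()].add(parent_id)
    return index


def search(index, query_terms):
    """Score each parent by the number of distinct query terms it matches.

    One index lookup per term stands in for one join per term.
    """
    scores = defaultdict(int)
    for term in {t.lower() for t in query_terms}:
        for parent_id in index.get(term, ()):
            scores[parent_id] += 1
    # Highest score first; ties broken by parent id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample data: three parent rows with their child values.
parents = {
    1: ["java", "oracle", "sql"],
    2: ["java", "lucene"],
    3: ["perl"],
}

index = build_index(parents)
results = search(index, ["Java", "Lucene", "SQL"])
print(results)
```

The index is built once and kept in sync with the database, so the cost of a query grows with the number of terms and matching parents rather than with the number of joins.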