From: Moritz Lenz
To: lucy-dev@incubator.apache.org
Date: Sun, 21 Aug 2011 21:34:58 +0200
Message-ID: <4E515DE2.5010601@faui2k3.org>
Subject: [lucy-dev] Automagic phrase search

Hi,

in the past I've often wondered why "home made" search engines on websites (often even on really big sites) feel so much inferior to the big names in the search business like Google, Yahoo, Bing and DuckDuckGo.

One of my conclusions is that Google & co. usually treat the order of search terms as an important indicator of relevance, while most other search engines don't.

For example, if you enter the words 'search engine' (without the quotes), you'll mostly get exact matches first, as if you had really searched for '"search engine"'. Google goes even further: if you search for a number of words in a row, it ranks documents higher that have most, but not necessarily all, of the search words in the right order next to each other, even if there are one or two other words in between.

Long rambling, short message: I'd love to have a mechanism in Lucy that provides an automagic phrase search as described above, one which honors the order of search words even outside an explicit phrase search, and does so less restrictively than an explicit phrase search.

Is something like that already implemented, and if not, is it on any agenda?

I know far too little about Lucy's internal workings to know whether that's easy or even possible, but for me it would be a real killer feature. If somebody points me in the right direction I might even give it a try, though my C fu is mediocre at best.

Cheers,
Moritz
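
P.S. To make the idea a bit more concrete, here is a rough Python sketch of the kind of order-aware bonus I have in mind. It is not based on Lucy's API, and I am not claiming that any of the big search engines work this way; the names `term_positions`, `order_bonus` and `max_gap` are made up for illustration. The assumption is that such a bonus would be added on top of the normal relevance score rather than replace it.

```python
from collections import defaultdict


def term_positions(tokens):
    """Map each token to the list of positions where it occurs."""
    positions = defaultdict(list)
    for i, tok in enumerate(tokens):
        positions[tok].append(i)
    return positions


def order_bonus(query, document, max_gap=2):
    """Score how well `document` preserves the word order of `query`.

    For every adjacent pair of query terms, find the closest occurrence of
    the second term after the first. A pair contributes 1/distance if at
    most `max_gap` other words sit in between, so direct adjacency counts
    1.0 and looser matches count less. Pairs in the wrong order or too far
    apart contribute nothing.
    """
    q = query.lower().split()
    pos = term_positions(document.lower().split())
    bonus = 0.0
    for first, second in zip(q, q[1:]):
        best = None
        for p in pos.get(first, []):
            for p2 in pos.get(second, []):
                dist = p2 - p
                if 0 < dist <= max_gap + 1 and (best is None or dist < best):
                    best = dist
        if best is not None:
            bonus += 1.0 / best
    return bonus


if __name__ == "__main__":
    docs = [
        "engine search results",
        "search the web with this engine",
        "a fast search engine for books",
    ]
    # Documents that keep the query words in order and close together
    # sort to the front.
    for doc in sorted(docs, key=lambda d: order_bonus("search engine", d),
                      reverse=True):
        print(doc)
```

Whitespace tokenizing and the 1/distance weighting are arbitrary choices on my part; the point is only that word order and proximity feed into the ranking without requiring an exact phrase match.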