From: climbingrose
To: java-user (java-user@lucene.apache.org)
Date: Mon, 24 Mar 2008 10:36:56 +1100
Subject: Implement a relaxed PhraseQuery?

Hi all,

I posted this on the Solr mailing list, but I think it is more appropriate here, and I suspect many people run into the same situation. Basically, we'd like a PhraseQuery with a "minimum should match" property similar to BooleanQuery's. Consider the query "Senior Java Developer":

1) I'd like to run a PhraseQuery on "Senior Java Developer" with a slop of, say, 2, so that the query only matches documents in which these words occur close together. I don't want to match documents in which the three words are scattered far apart.

2) I also want to relax the PhraseQuery a bit so that it matches not only "Senior Java Developer"~2 but also "Java Developer"~2, though of course with a lower score.

I can programmatically generate all the combinations, but that is not going to be efficient if the user issues a query with many terms. It looks like the only solution is to hack PhraseScorer and its subclasses. Has anyone done this before? If so, please share your experience.

-- 
Regards,
Cuong Hoang
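
For reference, the "generate the combinations" workaround mentioned above could look roughly like the sketch below. It is only an illustration in plain Java, not a Lucene API: the class name RelaxedPhraseSketch, the method names, and the choice of the kept-term count as the boost are all assumptions. It emits query strings in the classic query parser's "phrase"~slop^boost form, one clause per in-order subset of the terms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: it only builds query strings.
final class RelaxedPhraseSketch {

    /**
     * Returns one sloppy-phrase clause per in-order subset of the terms that
     * keeps at least minShouldMatch terms (and never fewer than two, so that
     * every clause is still a phrase). The boost is simply the number of kept
     * terms, which is an assumption made here so longer matches score higher.
     */
    static List<String> subPhrases(List<String> terms, int minShouldMatch, int slop) {
        int n = terms.size();
        if (n > 20) {
            throw new IllegalArgumentException("too many terms for exhaustive expansion");
        }
        int minTerms = Math.max(2, minShouldMatch);
        List<String> clauses = new ArrayList<>();
        // Each bit mask selects a subset of term positions; term order is preserved.
        for (int mask = 1; mask < (1 << n); mask++) {
            int kept = Integer.bitCount(mask);
            if (kept < minTerms) {
                continue;
            }
            StringBuilder phrase = new StringBuilder();
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    if (phrase.length() > 0) {
                        phrase.append(' ');
                    }
                    phrase.append(terms.get(i));
                }
            }
            clauses.add("\"" + phrase + "\"~" + slop + "^" + kept);
        }
        return clauses;
    }

    /** Joins the clauses into a single disjunction. */
    static String toQuery(List<String> terms, int minShouldMatch, int slop) {
        return String.join(" OR ", subPhrases(terms, minShouldMatch, slop));
    }

    public static void main(String[] args) {
        System.out.println(toQuery(Arrays.asList("Senior", "Java", "Developer"), 2, 2));
    }
}
```

Three terms with a minimum of two yield four clauses, but ten terms already yield 1013, which is the inefficiency the question is about. The sketch also ignores that dropping a middle term changes the distance between the remaining terms, so the slop may need adjusting per clause.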