From: Grant Ingersoll
To: java-user@lucene.apache.org
Subject: Re: Relevancy Practices
Date: Wed, 5 May 2010 07:08:41 -0700

On May 2, 2010, at 5:50 AM, Avi Rosenschein wrote:

> On 4/30/10, Grant Ingersoll wrote:
>>
>> On Apr 30, 2010, at 8:00 AM, Avi Rosenschein wrote:
>>> Also, tuning the algorithms to the users can be very important. For
>>> instance, we have found that in a basic search functionality, the
>>> default query parser operator OR works very well. But on a page for
>>> advanced users, who want to very precisely tune their search results,
>>> a default of AND works better.
>>
>> Avi,
>>
>> Great example. Can you elaborate on how you arrived at this conclusion?
>> What things did you do to determine it was a problem?
>>
>> -Grant
>
> Hi Grant,
>
> Sure. On http://wiki.answers.com/, we use search in a variety of
> places and ways.
>
> In the basic search box (what you get if you look stuff up in the main
> Ask box on the home page), we generally want the relevancy matching to
> be pretty fuzzy.
> For example, if the user looked up "Where can you see
> photos of the Aurora Borealis effect?" I would still want to show them
> "Where can you see photos of the Aurora Borealis?" as a match.
>
> However, the advanced search page,
> http://wiki.answers.com/Q/Special:Search, is used by advanced users to
> filter questions by various facets and searches, and to them it is
> important for the filter to filter out non-matches, since they use it
> as a working page. For example, if they want to do a search for "Harry
> Potter" and classify all results into the "Harry Potter" category, it
> is important that not every match for "Harry" is returned.

I'm curious, Avi, if you can share how you came to these conclusions?
For instance, did you have any qualitative evidence that "fuzzy" was
better for the main page? Or was it an "I know it when I see it" kind of
thing?
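
For anyone following the thread who wants to see the OR-versus-AND difference concretely, here is a minimal sketch in plain Java. It is not Lucene code and not the answers.com implementation; the class name DefaultOperatorSketch, the crude tokenizer, and the sample documents are all assumptions made up for illustration. It models only whether a document matches, not how it would be ranked. In Lucene itself the corresponding switch is QueryParser.setDefaultOperator.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only: a toy matcher, not Lucene and not answers.com code.
final class DefaultOperatorSketch {

    // Lowercase the text and split on non-alphanumeric characters.
    // This is a crude stand-in for a real analyzer.
    static Set<String> terms(String text) {
        Set<String> result = new HashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }

    // requireAll = true models a default operator of AND (every query term must appear);
    // requireAll = false models a default operator of OR (any query term is enough).
    static boolean matches(String query, String document, boolean requireAll) {
        Set<String> queryTerms = terms(query);
        Set<String> docTerms = terms(document);
        if (queryTerms.isEmpty()) {
            return false;
        }
        if (requireAll) {
            return docTerms.containsAll(queryTerms);
        }
        for (String term : queryTerms) {
            if (docTerms.contains(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] docs = {
            "Who wrote Harry Potter?",   // made-up sample question
            "Who was Harry Truman?"      // made-up sample question
        };
        for (String doc : docs) {
            System.out.println(doc
                + " OR=" + matches("Harry Potter", doc, false)
                + " AND=" + matches("Harry Potter", doc, true));
        }
    }
}
```

With OR, both sample questions match "Harry Potter", which is the loose behavior Avi describes for the main Ask box. With AND, only the first does, which is what the advanced page needs. The flip side is the Aurora Borealis case: under AND, the extra word "effect" in the query is enough to exclude the otherwise identical question.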