From: Erick Erickson
To: solr-user@lucene.apache.org
Subject: Re: Matching exact words
Date: Thu, 26 Aug 2010 10:16:36 -0700
In-Reply-To: <1282832640292-1353350.post@n3.nabble.com>

You'll have to change your index, I'm afraid. The problem is that all the
index sees is the stemmed version (assuming you're stemming at index time).
There's no information in the index about what the original version was,
so it's impossible to back this out.

One solution is to use copyField to make a copy of the input that does NOT
stem, and search against (or boost) that field when you care about the
stemmed/unstemmed distinction.

And a minor clarification. The "types" you refer to aren't really a Solr
entity. They are just a convenient collection of tokenizers and filters
(stemmers among them) that are provided in the schema file. You can freely
create your own types by simply mixing and matching these (you probably
already know this, but the phrasing of your question caused me to wonder).

See: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

Best
Erick

On Thu, Aug 26, 2010 at 7:24 AM, ahammad wrote:

>
> Hello,
>
> I have a case where if I search for the word "windows", I get results
> containing both "windows" and "window" (and probably other things like
> "windowing" etc.). Is there a way to find exact matches only?
>
> The field in which I am searching is a text field, which as I understand
> causes this behaviour. I cannot use a string field because it is very
> restricted, but what else can be done? I understand there are other types
> of text fields that are more strict than the standard field.
>
> Ideally I would like to keep my index the way it is, with the ability to
> force exact matches. For example, if I can search "windows -window" or
> something like that, that would be great. Or if I can wrap my query in a
> set of quotes to tell it to match exactly. I've seen that done before but
> I cannot get it to work.
>
> As a reference, here is my query:
>
> q={!boost b=$db v=$qq defType=$sh}&qq=windows&db=recip(ms(NOW,lastModifiedLong),3.16e-11,1,1)&sh=dismax
>
> To be quite frank, I am not very familiar with this syntax. I am just using
> whatever my old coworker left behind.
>
> Any tips on how to find exact matches or improve the above query will be
> greatly appreciated.
>
> Thanks
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Matching-exact-words-tp1353350p1353350.html
> Sent from the Solr - User mailing list archive at Nabble.com.
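
For anyone finding this thread later, the copyField suggestion in the reply
could look roughly like the schema.xml fragment below. It is only a sketch
under assumptions: the names text_unstemmed, text and text_exact are made up
for illustration (the thread never names the actual fields), and the analyzer
chain is just one plausible choice with the stemming filter left out.

```xml
<!-- Sketch only: "text_unstemmed", "text" and "text_exact" are hypothetical
     names, not taken from the original poster's schema. -->

<!-- Inside <types>: tokenizes and lowercases, but has no stemming filter. -->
<fieldType name="text_unstemmed" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Inside <fields>: the unstemmed copy. It is only searched, never
     displayed, so it does not need to be stored. -->
<field name="text_exact" type="text_unstemmed" indexed="true" stored="false"/>

<!-- After </fields>: feed the raw input of the stemmed field (assumed here
     to be called "text") into the unstemmed field at index time. -->
<copyField source="text" dest="text_exact"/>
```

The documents have to be reindexed after a change like this, since copyField
only acts at index time. If the source field is multiValued, the destination
needs multiValued="true" as well. With dismax, the unstemmed field can then be
listed in the qf parameter, on its own for exact matches only or alongside the
stemmed field with a higher boost.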