Message-ID: <53761D46.5030201@elyograg.org>
Date: Fri, 16 May 2014 08:14:30 -0600
From: Shawn Heisey
To: solr-user@lucene.apache.org
Subject: Re: Difference between search strings
In-Reply-To: <1400073330043-4135571.post@n3.nabble.com>

On 5/14/2014 7:15 AM, nativecoder wrote:
> Can someone please tell me the difference between searching a text in the
> following ways
>
> 1. q=Exact_Word:"samplestring" -> What does it tell to solr ?
>
> 2. q=samplestring&qf=Exact_Word -> What does it tell to solr ?
>
> 3. q="samplestring"&qf=Exact_Word -> What does it tell to solr ?
>
> I think the first and the third one are the same. is it correct ? How does
> it differ from the second one.
>
> I am trying to understand how enclosing the full term in "" is resolving the
> solr specific special character problem? What does it tell to solr ? e.g If
> there is "!" mark in the string solr will identify it as a NOT, "!" is part
> of the string. This issue can be corrected if the full string is enclosed in
> a "".

Quotes surrounding a Solr query turn it into a phrase query. For fields
where the entire text is a single token, this becomes an exact match.
For tokenized fields, it means that term positions in the index and the
query will be compared, so the query terms will need to be next to each
other and in that specific order in the indexed data.

Your first and third examples should parse the same, although the third
one only works with the dismax and edismax parsers. The first one would
work correctly with the standard parser and the edismax parser, but not
the dismax parser.

Quotes will *also* eliminate the need to escape characters that would
normally require backslash escaping. For single-token fields where
you're doing exact match, quotes will also preserve spaces in the query.
If you need an actual quote character to be in your query, it needs to
be escaped.

As for the problem you are having with the exclamation point: the Solr
analysis page indicates that KeywordTokenizer does *not* split on
exclamation points. The only thing I am aware of that uses exclamation
points for splitting is explicit document routing in SolrCloud. If the
field you are using is the uniqueKey for your index and you are running
SolrCloud, then text before an exclamation point is used for document
routing.

Note: You should not use a solr.TextField type for your uniqueKey field;
it should be solr.StrField. If you use solr.StrField, then you cannot
have an analysis chain with a tokenizer, so any possible confusion about
what KeywordTokenizer does would disappear.

Thanks,
Shawn
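
To make the quoting rules above concrete, here is a minimal Python
sketch of how a client might build the first and third query forms from
the question. It is only an illustration under stated assumptions: the
helper names (quote_phrase, field_query, qf_query) are hypothetical,
Exact_Word is the field name from the question, and selecting the
parser with defType=edismax is one possible way to satisfy the
requirement that qf needs the dismax or edismax parser.

```python
from urllib.parse import urlencode


def quote_phrase(text: str) -> str:
    """Wrap text in double quotes so Solr treats it as a phrase.

    Inside the quotes only the backslash and the quote character
    itself are escaped here; characters such as "!" are left alone.
    (Hypothetical helper, not part of any Solr client library.)
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def field_query(field: str, text: str) -> str:
    """Form 1 from the question: q=field:"text"."""
    return urlencode({"q": f"{field}:{quote_phrase(text)}"})


def qf_query(field: str, text: str) -> str:
    """Form 3 from the question: q="text"&qf=field.

    Assumes the request picks the edismax parser via defType,
    since qf is only honored by dismax and edismax.
    """
    return urlencode(
        {"defType": "edismax", "q": quote_phrase(text), "qf": field}
    )


if __name__ == "__main__":
    # URL-encoded parameter strings ready to append to a /select URL.
    print(field_query("Exact_Word", "sample!string"))
    print(qf_query("Exact_Word", "sample!string"))
    # An embedded quote character is backslash-escaped before encoding.
    print(quote_phrase('say "hi"'))
```

The URL encoding step is separate from the Solr query syntax: the
quoting and backslash escaping are what the query parser sees, while
urlencode only protects the characters in transit over HTTP.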