From: Ahmet Arslan
To: "solr-user@lucene.apache.org"
Subject: Re: Solr - Match whole word only in text fields
Date: Fri, 27 Dec 2013 09:12:14 -0800 (PST)
Message-ID: <1388164334.7531.YahooMailNeo@web125305.mail.ne1.yahoo.com>
In-Reply-To: <173401388117907@web1g.yandex.ru>
References: <1387739478532-4107795.post@n3.nabble.com> <1388013462.15876.YahooMailNeo@web125304.mail.ne1.yahoo.com> <173401388117907@web1g.yandex.ru>

Hi Haya,

Yes, you are correct: "myName=aaa bbb" will produce the index terms "myName", "aaa", "bbb". You can verify this on the admin analysis page.
You can test your analyzer there by entering sample text in the user interface.
Your query "myName aaa" will be a phrase query and will match with the above settings.
Your query "myName bbb" won't match.

http://lucene.apache.org/core/4_6_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Proximity_Searches

It is best to give it a try.

Ahmet


On Friday, December 27, 2013 6:18 AM, Kydryavtsev Andrey wrote:
Hi everybody!

Ahmet, do I understand this correctly: if I use this text_char_norm field type, then for the input "myName=aaa bbb" I'll index the terms "myName", "aaa", "bbb"? So I'll match a query like "myName" or a query like "bbb", but not "myName aaa". I can use this type for the query value too, so that "myName aaa" is split into ("myName" && "aaa"), and it will work. But this approach will give a false positive match for "myName bbb". What do you think, how can I handle this? One approach is to use KeywordTokenizer+ShingleFilter in this field type instead of WhitespaceTokenizerFactory, so that tokens like "myName", "myName aaa", "myName aaa bbb", "aaa", "aaa bbb", "bbb" will be indexed, but that significantly increases the index size for long values.


26.12.2013, 03:20, "Ahmet Arslan":
> Hi Haya,
>
> With MappingCharFilter you can have full control over the set of characters that you want to split on.
>
> In mappings.txt you will have
>
> ":" => " "
> "=" => " "
>
> Use the following type and see if it suits your needs. Update mappings.txt according to your needs.
>
> [field type definition not preserved in the archive]
>
> On Sunday, December 22, 2013 9:19 PM, haya.axelrod wrote:
> I have a text field that can contain very long values (like text files).
> I want to create a field type for it (text, not string), in order to have
> something like "Match whole word only" in Notepad++, but the delimiter
> should not be only white space. If I have:
>
> myName=aaa bbb
>
> I would like to match it for the following search strings: "aaa", "bbb",
> "aaa bbb", "myName=aaa bbb", "myName", but not for "aa" or "ame=a" or "a bb".
> Another example is:
>
> aaa bbb
> Can I do this somehow?
>
> What should my field type definition be?
>
> The text can contain any character. Before searching, I escape the search
> string using
> http://lucene.apache.org/solr/4_2_1/solr-solrj/org/apache/solr/client/solrj/util/ClientUtils.html
>
> Thanks
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Solr-Match-whole-word-only-in-text-fields-tp4107795.html
> Sent from the Solr - User mailing list archive at Nabble.com.
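
As a rough illustration of the matching behaviour discussed in this thread, here is a small Python sketch. It is only a toy model, not Solr or Lucene code: analyze and phrase_match are hypothetical helper names, the character replacement stands in for MappingCharFilter with the two mappings.txt rules quoted above, str.split stands in for WhitespaceTokenizerFactory, and the phrase check assumes the same analysis at query time with zero slop.

```python
# Toy model of the analysis chain discussed in this thread; NOT Solr code.
# Assumption: the same analysis is applied at index time and at query time.
MAPPINGS = {":": " ", "=": " "}  # the two rules from the quoted mappings.txt


def analyze(text):
    """Apply the character mappings, then split on whitespace."""
    for src, dst in MAPPINGS.items():
        text = text.replace(src, dst)
    return text.split()


def phrase_match(indexed_text, query_text):
    """Return True if the query terms occur as consecutive terms in the text."""
    doc = analyze(indexed_text)
    query = analyze(query_text)
    if not query:
        return False
    n = len(query)
    # Slide a window of n terms over the document terms (phrase, slop 0).
    return any(doc[i:i + n] == query for i in range(len(doc) - n + 1))


if __name__ == "__main__":
    doc = "myName=aaa bbb"
    queries = ["aaa", "bbb", "aaa bbb", "myName=aaa bbb", "myName",
               "myName aaa", "myName bbb", "aa", "ame=a", "a bb"]
    for q in queries:
        print(repr(q), phrase_match(doc, q))
```

In this model "myName aaa" matches because the two terms are adjacent, while "myName bbb" does not, and partial words such as "aa" never match because only whole terms are compared. The real behaviour still has to be checked on the admin analysis page.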