From: Finotti Simone
To: "solr-user@lucene.apache.org"
Subject: Re: Skip first word
Date: Fri, 27 Jul 2012 09:46:47 +0000

Brilliant!
Thank you very much :)

________________________________________
From: Chantal Ackermann [c.ackermann@it-agenten.com]
Sent: Friday, 27 July 2012 11:20
To: solr-user@lucene.apache.org
Subject: Re: Skip first word

Hi Simone,

No, I meant that you populate the two fields with the same input, which is best done via a copyField directive.

The first field will contain ngrams of size 1 and 2. The other field will contain ngrams of size 3 and longer (you might want to set a decent maxsize there).

The query for the autocomplete list uses the first field when the input (typed in by the user) is one or two characters long. Your example was "D" or "G", and then "Do" or "Ga". The query would then search only the single-token field, which for the input "Dolce & Gabbana" contains only the ngrams "D" and "Do". So only the input "D" or "Do" would result in a hit on "Dolce & Gabbana".
Once the user has typed in the third letter, "Dol" or "Gab", you query the second, more tokenized field, which for "Dolce & Gabbana" would contain the ngrams "Dol", "Dolc", "Dolce", "Gab", "Gabb", "Gabba", etc.
Both inputs "Gab" and "Dol" would then return "Dolce & Gabbana".

1. First field type:

2. Second field type:

3. Field declarations:

Chantal

On 27.07.2012 at 11:05, Finotti Simone wrote:

> Hi Chantal,
>
> if I understand correctly, this implies that I have to populate different fields according to their length. Since I'm not aware of any logical condition you can apply to the copyField directive, it means that this logic has to be implemented by the process that populates the Solr core. Is this assumption correct?
>
> That's kind of bad, because I'd like to have this kind of "rules" in the Solr configuration. Of course, if that's the only way... :)
>
> Thank you
>
> ________________________________________
> From: Chantal Ackermann [c.ackermann@it-agenten.com]
> Sent: Thursday, 26 July 2012 18:32
> To: solr-user@lucene.apache.org
> Subject: Re: Skip first word
>
> Hi,
>
> use two fields:
> 1. KeywordTokenizer (= single token) with ngram minsize=1 and maxsize=2 for inputs of length < 3,
> 2. the other one tokenized as appropriate with minsize=3 and longer for all longer inputs
>
> Cheers,
> Chantal
>
> On 26.07.2012 at 09:05, Finotti Simone wrote:
>
>> Hi Ahmet,
>> business asked me to apply EdgeNGram with minGramSize=1 on the first term and with minGramSize=3 on the subsequent terms.
>>
>> We are developing a search suggestion mechanism. The idea is that if the user types "D", the engine should suggest "Dolce & Gabbana", but if the user types "G", it should suggest other brands. Only if the user types "Gab" should it suggest "Dolce & Gabbana".
>>
>> Thanks
>> S
>> ________________________________________
>> From: Ahmet Arslan [iorixxx@yahoo.com]
>> Sent: Wednesday, 25 July 2012 18:10
>> To: solr-user@lucene.apache.org
>> Subject: Re: Skip first word
>>
>>> is there a tokenizer and/or a combination of filters to
>>> remove the first term from a field?
>>>
>>> For example:
>>> The quick brown fox
>>>
>>> should be tokenized as:
>>> quick
>>> brown
>>> fox
>>
>> There is no such filter that I know of. You can, however, implement one by modifying the source code of LengthFilterFactory or StopFilterFactory. They both remove tokens. Out of curiosity, what is the use case for this?
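
For anyone who wants to check the two-field idea from this thread without a running Solr core, here is a minimal sketch in plain Python. It is not Solr configuration and not anyone's actual setup from the thread: it only emulates what the two fields would contain, assuming edge (prefix) n-grams as in the "Dol", "Dolc", "Dolce" example, whitespace tokenization for the second field, and case-sensitive matching. The names `edge_ngrams`, `short_field`, `long_field`, `suggest` and the value of `MAX_GRAM` are made up for illustration.

```python
MAX_GRAM = 15  # hypothetical upper bound; the thread only asks for "a decent maxsize"


def edge_ngrams(token, min_size, max_size):
    """Return the prefixes of token with lengths min_size..max_size."""
    longest = min(max_size, len(token))
    return [token[:n] for n in range(min_size, longest + 1)]


def short_field(text):
    """Emulate field 1: whole input kept as one token, prefixes of length 1 and 2."""
    return set(edge_ngrams(text, 1, 2))


def long_field(text):
    """Emulate field 2: whitespace tokens (an assumption), prefixes of length 3 and longer."""
    grams = set()
    for token in text.split():
        grams.update(edge_ngrams(token, 3, MAX_GRAM))
    return grams


def suggest(typed, brands):
    """Choose the field by input length, then match the typed text against its grams."""
    field = short_field if len(typed) < 3 else long_field
    return [brand for brand in brands if typed in field(brand)]


brands = ["Dolce & Gabbana", "Gucci"]
print(suggest("D", brands))
print(suggest("G", brands))
print(suggest("Gab", brands))
```

With these two sample brands, "D" and "Gab" both lead to "Dolce & Gabbana", while "G" only leads to "Gucci", which is the behaviour asked for at the start of the thread. The choice of field happens on the query side, so both fields can be filled from the same input and no conditional copyField is needed.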