Date: Fri, 9 Apr 2010 04:57:08 -0800 (PST)
From: MitchK
To: solr-user@lucene.apache.org
Message-ID: <1270817828020-708264.post@n3.nabble.com>
References: <1270304316019-694867.post@n3.nabble.com> <992CCA59-B2E6-4EF8-A4E2-002C9BF5B558@apache.org> <1270499186366-698683.post@n3.nabble.com>
Subject: Re: Minimum Should Match the other way round

Hoss, before I run into any misunderstandings, I want to come
back to the topic first. I will have a look at some classes later to find out whether some other ideas that are not directly related to this topic (like multiword synonyms at query time) will work or not. I'm sorry for being off-topic.

Chris Hostetter-3 wrote:
>
> where the analyzer matters is in creating that numeric field at index time
> ... hence my suggestion of having an analyzer chain that exactly matches
> the field you are interested in, but ending with a TokenCountingFilter --
> it can take care of creating the "numeric-ish" (padded) field value when
> the docs are indexed.
>

Okay, as I understand it, you mean something like this: This fieldType should "store" (or, let's say, index) the number of tokens as something like "005" for 5 tokens, right?

My problem is that I don't know how to query this field. I know what you mean by the step "Add +titleLen:[* TO MAX_LEN]", but I don't know how to retrieve the MAX_LEN value for a specific query, since in some cases it depends on which analyzer chain is used for the tokenLen field.

For example: I think it makes sense to use a WordDelimiterFilter at the end of my TokenFilter chain. If my document is something like "The secrets of the iPhone 3G", then I want to index it as "The secrets of the iPhone 3 G" (3G is indexed as two tokens). This means that the document length increases by one token.

However, maybe I misunderstood your point "- Pick MAX_LEN Based On Number Of Query Clauses From Super", since I thought that the number of query clauses depends on the number of whitespace characters in my query. If I am wrong and it depends on the result of my analyzer chain, there is no problem. But I am not sure whether this is the case.

Thank you for your help.

- Mitch

--
View this message in context: http://n3.nabble.com/Minimum-Should-Match-the-other-way-round-tp694867p708264.html
Sent from the Solr - User mailing list archive at Nabble.com.
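
To make the token-count question concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: analyze() is a crude stand-in for an analyzer chain that ends with a WordDelimiterFilter-style split (it is not Solr's implementation), and padded_count() and length_clause() are hypothetical helpers that mimic what the proposed TokenCountingFilter and the "+titleLen:[* TO MAX_LEN]" clause might produce. It does not settle whether the query parser actually counts clauses this way.

```python
import re


def analyze(text):
    """Whitespace-tokenize, then split each token at letter/digit boundaries.

    Assumption: a simplified stand-in for the analyzer chain, not Solr code.
    """
    tokens = []
    for word in text.split():
        # "3G" becomes ["3", "G"]; purely alphabetic words stay whole.
        tokens.extend(re.findall(r"[A-Za-z]+|[0-9]+", word))
    return tokens


def padded_count(text, width=3):
    """Token count as a zero-padded string, e.g. 5 tokens -> "005"."""
    return str(len(analyze(text))).zfill(width)


def length_clause(query, field="titleLen", width=3):
    """Build the range clause, taking MAX_LEN from the analyzed query."""
    return "+{}:[* TO {}]".format(field, padded_count(query, width))


doc = "The secrets of the iPhone 3G"
print(analyze(doc))               # tokens after the simplified analysis
print(padded_count(doc))          # padded value that would be indexed
print(length_clause("iPhone 3G")) # clause built from the analyzed query
```

In this sketch the query "iPhone 3G" has two whitespace-separated words but three analyzed tokens, which is exactly the difference I am asking about: MAX_LEN only lines up with the indexed value if it is derived from the analyzed query and padded to the same width.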