From: Sebastian M
To: solr-user@lucene.apache.org
Date: Wed, 22 Dec 2010 07:32:18 -0800 (PST)
Subject: Solr Spellcheker automatically tokenizes on period marks

Hello,

My main (full text) index contains the terms "www", "sometest", "com", which is intended and correct. My spellcheck index contains the term "www.sometest.com",
which is also intended and correct.

However, when querying the spellchecker with "www.sometest.com", I get the suggestion "www.www.sometest.com.com", even though my spellcheck query analyzer does not use a tokenizer that splits on "." (period marks). When running the Field Analyzer (on the Solr admin page), I can see that even after the last filter (see below), my term text remains "www.sometest.com", which is untokenized, as expected.

Any thoughts as to what may be causing this undesired tokenization?

To summarize:

Main index contains: "www", "sometest", "com"
Spellcheck index contains: "www.sometest.com"
Spellcheck query: "www.sometest.com"
Expected result: (no suggestion)
Actual result: "www.www.sometest.com.com"

Here is my spellcheck query analyzer:

Thank you in advance; any suggestions are welcome!

Sebastian
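
For anyone trying to reproduce this, here is a toy model of one way the reported string could arise. It is only a sketch under assumptions, not a confirmed diagnosis: it assumes the raw query is split on non-word characters before the query analyzer ever sees it (Solr's SpellingQueryConverter may be worth checking for this), that only the piece "sometest" is similar enough to the indexed term to get a suggestion, and that the suggestion is spliced back into the original string at that piece's offsets. The names `SPELLCHECK_INDEX`, `suggest`, `collate`, and the 0.5 threshold are hypothetical and are not Solr APIs.

```python
import re
from difflib import SequenceMatcher

# Hypothetical stand-in for the spellcheck index described in the post.
SPELLCHECK_INDEX = ["www.sometest.com"]


def suggest(token, threshold=0.5):
    """Toy suggester (assumption): return the most similar indexed term,
    or None if the token is already indexed or nothing is similar enough."""
    if token in SPELLCHECK_INDEX:
        return None
    best = max(SPELLCHECK_INDEX,
               key=lambda term: SequenceMatcher(None, token, term).ratio())
    if SequenceMatcher(None, token, best).ratio() >= threshold:
        return best
    return None


def collate(query, token_pattern=r"\w+"):
    """Split the raw query with token_pattern, look up a suggestion for each
    piece, and splice suggestions back into the query by character offset."""
    result = query
    # Work right to left so earlier offsets stay valid after each splice.
    for match in reversed(list(re.finditer(token_pattern, query))):
        replacement = suggest(match.group())
        if replacement is not None:
            result = result[:match.start()] + replacement + result[match.end():]
    return result


print(collate("www.sometest.com"))                        # split on non-word characters
print(collate("www.sometest.com", token_pattern=r"\S+"))  # whole query as one token
```

In this model the first call splices the suggestion for "sometest" between "www." and ".com", while the second call, which keeps the query as a single token, finds it in the index and leaves it unchanged. If the real behavior matches, the splitting would be happening outside the query analyzer, which would explain why the Field Analyzer output looks correct.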