From: Christian Zambrano
Date: Fri, 11 Sep 2009 00:53:26 -0500
To: solr-user@lucene.apache.org
Subject: What TokenizerFactory/TokenFilterFactory can/should I use so a search for "wal mart" matches "walmart" (quotes not included in search or index)?
Message-ID: <4AA9E5D6.7030809@gmail.com>

There are a lot of company names whose correct spelling people are unsure of. A few examples:

1. best buy, bestbuy
2. walmart, wal mart, wal-mart
3. Holiday Inn, HolidayInn

What TokenizerFactory and/or TokenFilterFactory should I use so that somebody typing "wal mart" (quotes not included) will find both "wal mart" and "walmart" (again, quotes not included)?

Thanks,
Christian
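P.S. To make the goal concrete, here is a minimal Python sketch of the equivalence I am after. It is not Solr configuration: `normalize_name` is a hypothetical helper made up for illustration, and it assumes that lowercasing and dropping whitespace and hyphens is enough to treat the variants as the same name. It also treats the whole input as one name, so it says nothing about how a name embedded in a longer query should be handled.

```python
import re


def normalize_name(text: str) -> str:
    """Hypothetical helper: reduce a company name to one lowercase key.

    Assumption: spaces and hyphens are the only differences between variants.
    """
    # Lowercase first, then remove every run of whitespace or hyphens.
    return re.sub(r"[\s\-]+", "", text.lower())


# The spelling variants listed above, grouped by company.
groups = [
    ["best buy", "bestbuy"],
    ["walmart", "wal mart", "wal-mart"],
    ["Holiday Inn", "HolidayInn"],
]

for variants in groups:
    # Each group should collapse to exactly one distinct key.
    keys = {normalize_name(v) for v in variants}
    print(variants, "->", sorted(keys))
```

Each group collapses to a single key, which is the behavior I would like the analysis chain to produce at both index and query time.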