From: smokey
To: java-user@lucene.apache.org
Date: Mon, 3 Dec 2007 10:23:21 -0500
Subject: SpellChecker performance and usage

My question is for anyone who has experience with Lucene's SpellChecker, especially its performance characteristics and ramifications.

1. Given that SpellChecker expands a query by adding all the permutations of a potentially misspelled word, how does it perform in general?

2. How are others handling cases where SpellChecker would NOT perform well when the query is expanded with all the permutations? In other words, what techniques are people using to work around or alleviate the performance hit, if any?

Any information or pointers would be appreciated.