From: "Nicolas Maisonneuve"
To: "Lucene Users List"
List-Id: "Lucene Developers List"
Subject: new version of spell checker
Date: Fri, 22 Oct 2004 04:11:46 +0200

UPDATE
- sort fixed (the sort order was reversed!)
- set the gram size dynamically (depending on the length of the word)
- use the FuzzyQuery score: ((edit distance)/(length of word))
- new Dictionary interface + LuceneDictionary and PlaintextDictionary implementations
- replace the addWords method with indexDictionary(Dictionary dic)
- add a new public method: boolean exist(word)
- add a build.xml

See the wiki page: http://wiki.apache.org/jakarta-lucene/SpellChecker

1 - Could we put the spellchecker in the sandbox? It would be easier to maintain than going through the Bugzilla/wiki process.
2 - Jonathan Hager: could you test this version with our dictionary and tell me the results?
3 - I am looking for a French dictionary. Does anyone have a URL where I could download one?

Thanks to Jonathan Hager and Aad Nales for your suggestions / observations ;-)

Nicolas Maisonneuve
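
For readers who want to see the two scoring-related items of the update in concrete form, here is a minimal sketch in plain Java. It is not the SpellChecker code from the wiki page. The class name, the method names and the gram-size thresholds are all assumptions chosen for illustration. The message gives the ratio (edit distance)/(length of word); the sketch subtracts that ratio from 1 so that a higher score means a closer match, which is also an assumption.

```java
// Illustrative sketch only: names and thresholds are hypothetical,
// not taken from the actual spell checker sources.
class SpellScoreSketch {

    // Hypothetical rule: longer words are indexed with longer grams.
    static int minGram(int wordLength) {
        if (wordLength > 5) return 3;
        if (wordLength == 5) return 2;
        return 1;
    }

    static int maxGram(int wordLength) {
        if (wordLength > 5) return 4;
        if (wordLength == 5) return 3;
        return 2;
    }

    // Levenshtein edit distance, computed with two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed normalisation: 1 - (edit distance / length of the query word),
    // clamped at 0 so very different candidates do not go negative.
    static float score(String word, String candidate) {
        if (word.isEmpty()) {
            return candidate.isEmpty() ? 1.0f : 0.0f;
        }
        float ratio = (float) editDistance(word, candidate) / word.length();
        return Math.max(0.0f, 1.0f - ratio);
    }

    public static void main(String[] args) {
        String word = "lucene";
        System.out.println("grams: " + minGram(word.length()) + ".." + maxGram(word.length()));
        System.out.println("score: " + score(word, "lucine"));
    }
}
```

Dividing by the word length keeps the score comparable across words: one wrong letter costs more in a short word than in a long one.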