From: "Jack Krupansky"
Subject: Re: Which stemmer?
Date: Wed, 21 Nov 2012 12:04:37 -0500

Great! For my favorite example of "invest", "invests", etc. it shows:

SnowballEnglish:
• investment
• invest
• invests
• investing
• invested

kStem:
• investors
• invest
• investor
• invests
• investing
• invested

minimalStem:
• invest
• invests

That highlights the distinctions between these stemmers quite well, without highlighting the actual indexed term, which can be quite ugly.

-- Jack Krupansky

-----Original Message-----
From: Elmer van Chastelet
Sent: Wednesday, November 21, 2012 8:49 AM
To: java-user@lucene.apache.org
Subject: Re: Which stemmer?

I've just created a small web application which you might find useful.
You can see which words are matched by a query word when using different analyzers (phonetic and stemming analyzers). These include snowball, kstem and minimal stem (the ones on the right).

http://dutieq.st.ewi.tudelft.nl/wordsearch/

I can extend the app with more analyzers. Please let me know :)

--Elmer

On 11/14/2012 07:55 PM, Scott Smith wrote:
> Does anyone have any experience with the stemmers? I know that Porter is
> what "everyone" uses. Am I better off with KStemFilter (better
> performance) or ?? Does anyone understand the differences between the
> various stemmers and how to choose one over another?

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
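
The idea discussed in this thread (which words a query word matches under a given stemmer) can be sketched in plain Java. This is only a rough illustration: `StemMatchDemo`, `minimalStem`, and `matches` are hypothetical names, and the plural-stripping rules are a simplified assumption, not the code behind Lucene's stemmers or the web app linked above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class; it does not use or reproduce any Lucene API.
final class StemMatchDemo {

    // Toy plural-stripping stemmer (an assumption for illustration only).
    // It leaves short words and "-ss"/"-us" endings alone, maps "-ies" to "-y",
    // and otherwise drops a single trailing "s".
    static String minimalStem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        int n = w.length();
        if (n < 3 || !w.endsWith("s")) {
            return w;
        }
        if (w.endsWith("ss") || w.endsWith("us")) {
            return w;
        }
        if (n > 4 && w.endsWith("ies")) {
            return w.substring(0, n - 3) + "y";
        }
        return w.substring(0, n - 1);
    }

    // Returns the vocabulary words whose stem equals the stem of the query word,
    // in vocabulary order.
    static List<String> matches(String query, List<String> vocabulary) {
        String target = minimalStem(query);
        List<String> hits = new ArrayList<>();
        for (String candidate : vocabulary) {
            if (minimalStem(candidate).equals(target)) {
                hits.add(candidate);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> vocabulary = Arrays.asList(
                "invest", "invests", "investing", "invested",
                "investment", "investor", "investors");
        System.out.println(matches("invest", vocabulary));
    }
}
```

With this sample vocabulary the program prints `[invest, invests]`, the same pair listed under minimalStem in the message above. A more aggressive stemmer would map more of these forms to one stem and so return a larger group.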