From: Susheel Kumar
Date: Thu, 15 Dec 2016 11:58:20 -0500
Subject: Re: Stemming with SOLR
To: solr-user@lucene.apache.org, Ahmet Arslan

We did an extensive comparison of Snowball, KStem and Hunspell in the past, and each of them handles some cases better than the others. You can use all three by defining three different fields (one fieldType per stemmer) and searching across all of them at query time. For the cases where none of them works (e.g. wolves, wolf), use StemmerOverrideFilterFactory.

HTH.

Thanks,
Susheel

On Thu, Dec 15, 2016 at 11:32 AM, Ahmet Arslan wrote:

> Hi,
>
> KStemFilter returns legitimate English words, please use it.
>
> Ahmet
>
> On Thursday, December 15, 2016 6:17 PM, Lasitha Wattaladeniya <wattale@gmail.com> wrote:
> Hello devs,
>
> I'm trying to develop an indexing and querying flow that converts words
> to their original form (lemmatization). I have been doing a bit of research
> lately, but the information on the internet is very limited. I tried using
> HunspellStemFilterFactory, but it doesn't convert a word to its original
> form; instead it gives suggestions for some words (Hunspell works correctly
> for some English words, but for others it gives multiple suggestions or no
> suggestions; I used the en_us.dic provided by OpenOffice).
>
> I know this is a generic problem in search, so is there anyone who can
> point me in the right direction or to some information? :)
>
> Best regards,
> Lasitha Wattaladeniya
> Software Engineer
>
> Mobile : +6593896893
> Blog : techreadme.blogspot.com
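As a rough illustration of the multi-field suggestion in the reply at the top of this thread, here is a minimal Python sketch. It is not taken from the thread: the field names (text_snowball, text_kstem, text_hunspell), the host and the collection name are hypothetical, and it assumes the three fields are already defined in the schema with one stemmer each. It builds edismax parameters whose qf lists all three fields, and formats word/stem pairs as the tab-separated lines that a stemmer override dictionary file is expected to contain.

```python
from urllib.parse import urlencode

# Hypothetical field names: one per stemmer, each assumed to be defined in the
# schema with its own fieldType and filled from the same source text.
STEMMED_FIELDS = ["text_snowball", "text_kstem", "text_hunspell"]


def build_select_params(user_query, fields=STEMMED_FIELDS):
    """Return edismax parameters that search every stemmed field at once."""
    return {
        "q": user_query,
        "defType": "edismax",
        # qf takes a space-separated list of fields to query.
        "qf": " ".join(fields),
    }


def stem_override_lines(overrides):
    """Format word -> stem pairs as tab-separated dictionary lines."""
    return [f"{word}\t{stem}" for word, stem in sorted(overrides.items())]


if __name__ == "__main__":
    params = build_select_params("wolves")
    # Hypothetical host and collection name.
    print("http://localhost:8983/solr/mycollection/select?" + urlencode(params))
    # Lines for the override dictionary, covering the wolves/wolf case.
    print("\n".join(stem_override_lines({"wolves": "wolf"})))
```

Querying all three fields trades a larger index for better recall; the override dictionary then only needs to cover the words that none of the three stemmers handles.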