Date: Mon, 18 Apr 2016 15:27:10 +0530
Subject: Re: Wildcard query behavior.
From: Modassar Ather
To: solr-user@lucene.apache.org

Thanks Reth for your response.

When validator is changed to validate, both at query time and at index
time, shouldn't validator* and validator return at least the same
results? E.g. 5 documents contain validator. At index time validator got
changed to validate. Now when validator* is searched, it will also change
to validate and should match all 5 documents.

In this case I am not sure how the wildcard is handled internally, that
is, what the query will be transformed to.

Please help me understand the internals of wildcards with stemming, or
point me to some documents, as I could not find any details on it.

Best,
Modassar

On Mon, Apr 18, 2016 at 1:04 PM, Reth RM wrote:

> If you search for f:validat*, then I believe you will get the same
> number of results. Please check.
>
> f:validator* is searching for records that have the prefix "validator",
> whereas the field with a stemmer which stems "validator" to "validate"
> (if this stemming was applied at index time as well as query time) is
> looking for records that have "validate" or "validator", so for obvious
> reasons, numFound might have been different.
>
> On Mon, Apr 18, 2016 at 12:48 PM, Modassar Ather wrote:
>
> > Hi,
> >
> > Please help me understand the following.
> >
> > I have an analysis chain which uses KStemFilterFactory for a field.
> > Solr version is 5.4.0.
> >
> > When I search for f:validator I get 80K+ documents, whereas if I
> > search for f:validator* I get only around 150 results.
> >
> > When I checked on the analysis page I see that validator is changed
> > to validate. Per my understanding, in both the above cases it should
> > at least give the exact same result of around 80K+ documents.
> >
> > I understand in some cases wildcards can result in sub-optimal
> > results for stemmed content. Please correct me if I am wrong.
> >
> > Thanks,
> > Modassar
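
The difference between the two counts discussed in this thread can be sketched with a toy model. It rests on an assumption not confirmed anywhere in the thread: that the wildcard term is compared against the indexed (already stemmed) terms without itself passing through the stemmer. Everything in the sketch is hypothetical: `TOY_STEMS` stands in for KStem, and the four documents are made up.

```python
# Toy model only: TOY_STEMS is a hypothetical stand-in for KStem,
# and the documents below are invented for illustration.
TOY_STEMS = {
    "validator": "validate",
    "validators": "validate",
    "validated": "validate",
}


def toy_stem(token):
    """Map a known word to its stem; leave unknown words unchanged."""
    return TOY_STEMS.get(token, token)


def index_terms(docs):
    """Build an inverted index of stemmed terms (index-time analysis)."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index.setdefault(toy_stem(token), set()).add(doc_id)
    return index


def term_query(index, word):
    """Plain term query: the word is stemmed before the lookup."""
    return index.get(toy_stem(word.lower()), set())


def prefix_query(index, prefix):
    """Prefix query: assumed to skip stemming and scan indexed terms."""
    prefix = prefix.lower()
    hits = set()
    for term, doc_ids in index.items():
        if term.startswith(prefix):
            hits |= doc_ids
    return hits


docs = {
    1: "the validator failed",
    2: "two validators ran",
    3: "input was validated",
    4: "validatorfactory created",  # token the toy stemmer leaves alone
}
index = index_terms(docs)

print(sorted(term_query(index, "validator")))    # [1, 2, 3]
print(sorted(prefix_query(index, "validator")))  # [4]
print(sorted(prefix_query(index, "validat")))    # [1, 2, 3, 4]
```

Under that assumption, f:validator matches every document whose token stems to "validate", while f:validator* only matches tokens that still begin with "validator" after index-time stemming, which would explain the much smaller count. A shorter prefix such as f:validat* covers the stemmed term as well.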