In-Reply-To: <1542648330279-0.post@n3.nabble.com>
From: Modassar Ather
Date: Wed, 21 Nov 2018 09:45:49 +0530
Subject: Re: Restrict search on term/phrase count in document.
To: solr-user@lucene.apache.org

Thanks for your replies.

The requirement is basically to exclude documents that match but contain only a small number of occurrences of the term or phrase, maybe 1 or 2. The user is interested only in documents where the matched term/phrase occurs more than a certain number of times. This seems like a valid feature/requirement.

Best,
Modassar

On Mon, Nov 19, 2018 at 10:55 PM Alessandro Benedetti wrote:

> I agree with Alexandre, it seems suspicious.
> Anyway, if you want to query on the frequency of a single term, you
> could make use of the function range query parser:
>
> https://lucene.apache.org/solr/guide/6_6/other-parsers.html#OtherParsers-FunctionRangeQueryParser
>
> And the functions:
>
> termfreq
> Returns the number of times the term appears in the field for that
> document.
> termfreq(text,'memory')
>
> tf
> Term frequency; returns the term frequency factor for the given term, using
> the Similarity for the field. The tf-idf value increases proportionally to
> the number of times a word appears in the document, but is offset by the
> frequency of the word in the corpus, which helps to control for the fact
> that some words are generally more common than others. See also idf.
> tf(text,'solr')
>
> Cheers
>
> ---------------
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
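
For anyone finding this thread later, here is a rough sketch of how the frange + termfreq suggestion could be turned into a filter query. It only builds the request URL and does not contact Solr. The host, the core name "mycore", the field "text" and the threshold of 3 are all made-up values for illustration, and the helper names are hypothetical. As far as I know termfreq counts a single indexed term, so this probably does not cover the phrase case.

```python
from urllib.parse import urlencode


def min_termfreq_filter(field: str, term: str, min_count: int) -> str:
    """Build an frange filter that keeps documents in which `term`
    occurs at least `min_count` times in `field`."""
    # Assumption: the term is a plain token. Quotes and backslashes are
    # rejected rather than escaped, to keep this sketch simple.
    if "'" in term or "\\" in term:
        raise ValueError("term must not contain quotes or backslashes")
    # 'l' is the (inclusive by default) lower bound of the function range.
    return f"{{!frange l={min_count}}}termfreq({field},'{term}')"


def build_select_url(base_url: str, q: str, field: str, term: str,
                     min_count: int) -> str:
    """Combine the main query with the term-frequency filter (fq)."""
    params = {"q": q, "fq": min_termfreq_filter(field, term, min_count)}
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical core and field names; adjust to your own setup.
    print(min_termfreq_filter("text", "memory", 3))
    print(build_select_url("http://localhost:8983/solr/mycore",
                           "text:memory", "text", "memory", 3))
```

Putting the condition in fq rather than q means it only restricts the result set and does not affect scoring, which seems to match the requirement described above.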