Date: Fri, 24 Jun 2016 16:25:38 +0000 (UTC)
From: Ahmet Arslan
To: "java-user@lucene.apache.org"
Message-ID: <834856302.824847.1466785538592.JavaMail.yahoo@mail.yahoo.com>
In-Reply-To: <576D3E83.6020400@wolfram.com>
Subject: Re: Favoring Terms Occurring in Close Proximity

Hi Daniel,

You can add optional clauses to your query for boosting purposes.
For example:

temperate OR climates OR "temperate climates"~5^100

ahmet

On Friday, June 24, 2016 5:07 PM, Daniel Bigham wrote:

Something significant that I've noticed about using the default Lucene query parser is that if your user enters a query like:

"temperate climates"

... it will get turned into an OR query:

temperate OR climates

This means that a document that contains the literal substring "temperate climates" will be on equal footing with a document that contains "temperate emotions may go a long way to keeping the peace as we continue to discuss climate change".

So far as I know, your typical search engine definitely does not ignore the relative positions of terms.

And so my question is: how do people typically deal with this when using Lucene? What is wanted is a query that prefers search terms to be close together but, failing that, is OK with the terms simply occurring in the document.

And again, the ultimate desire isn't just to construct a Query object to accomplish that, but to hook things up in such a way that a user can enter a query in an input box and have the system take their flat string and turn it into an intelligent query that acts somewhat like today's modern search engines in terms of wanting terms to be close to each other.

This is such a "basic" use case of a search system that I'm tempted to think there must be well-worn paths for doing this in Lucene.

Thanks,
Daniel
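
As a rough illustration of the boosting suggestion in the reply at the top of this message, the sketch below turns a flat user string into a query string of the same shape as the example there: the individual terms joined with OR, plus one optional sloppy phrase clause with a boost. It is plain Java with no Lucene dependency. The class name ProximityBoost, the method rewrite, and its slop and boost parameters are hypothetical names made up for this example, and the sketch assumes whitespace-separated terms that contain no query-syntax characters needing escaping.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Lucene. */
final class ProximityBoost {

    /**
     * Builds "t1 OR t2 OR ... OR "t1 t2 ..."~slop^boost" from a flat user string.
     * Assumption: terms are split on whitespace and are not escaped.
     */
    static String rewrite(String userInput, int slop, int boost) {
        List<String> terms = new ArrayList<>();
        for (String t : userInput.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        // With zero or one term there is no proximity to reward.
        if (terms.size() < 2) {
            return String.join(" ", terms);
        }
        // Any single term may match a document on its own ...
        String anyTerm = String.join(" OR ", terms);
        // ... while the optional sloppy phrase clause lifts documents
        // in which the terms occur close together.
        String phrase = "\"" + String.join(" ", terms) + "\"~" + slop + "^" + boost;
        return anyTerm + " OR " + phrase;
    }

    public static void main(String[] args) {
        System.out.println(rewrite("temperate climates", 5, 100));
    }
}
```

The resulting string could then be handed to the query parser in place of the raw user input. The slop of 5 and boost of 100 are simply the values from the example in the reply and would likely need tuning for a real index.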