From: Carsten Schnober
Organization: Institut für Deutsche Sprache
To: java-user@lucene.apache.org
Date: Thu, 21 Feb 2013 14:45:34 +0100
Message-ID: <512624FE.6020201@ids-mannheim.de>
Subject: ProximityQueryNode

Hi,

I'm interested in the functionality that ProximityQueryNode is supposed to implement. As far as I can tell, it is currently not used by the default QueryParser or anywhere else in Lucene, right? That makes perfect sense, since I don't see how a Lucene index would store any notion of sentences, paragraphs, etc. Is that right too?

I would be interested to hear whether anyone (else) is working on implementing this in a query parser, and about any theoretical or practical approaches to indexing the given types. Also, I think the type should (at some point in the future) be more flexible than the values enumerated in the class, so that one could also index arbitrary custom units, e.g. pages, discourse units, syntactic chunks, etc.

My current approach to indexing sentence and paragraph information is to store it in token payloads and then, for matching tokens, check whether their respective sentences satisfy the given distance query. Any better ideas?
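To make the payload idea concrete, here is a minimal sketch in plain Java with no Lucene dependency. The class name UnitPayload, its methods, and the 8-byte layout (sentence index followed by paragraph index) are all hypothetical and chosen only for illustration; it shows the encoding and the distance check, not how the bytes are attached to tokens or read back at search time.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical helper (not part of Lucene): packs the sentence and
 * paragraph index of a token into a byte array that could serve as
 * the token's payload, and checks unit distances between two tokens.
 */
final class UnitPayload {
    // Assumed layout: two big-endian ints, sentence first, paragraph second.
    private static final int BYTES = 2 * Integer.BYTES;

    private UnitPayload() {}

    /** Encodes the sentence and paragraph index of one token. */
    static byte[] encode(int sentence, int paragraph) {
        return ByteBuffer.allocate(BYTES).putInt(sentence).putInt(paragraph).array();
    }

    /** Reads the sentence index back from an encoded payload. */
    static int sentenceOf(byte[] payload) {
        return ByteBuffer.wrap(payload).getInt(0);
    }

    /** Reads the paragraph index back from an encoded payload. */
    static int paragraphOf(byte[] payload) {
        return ByteBuffer.wrap(payload).getInt(Integer.BYTES);
    }

    /** True if the two tokens are at most maxDistance sentences apart (0 = same sentence). */
    static boolean withinSentences(byte[] a, byte[] b, int maxDistance) {
        return Math.abs(sentenceOf(a) - sentenceOf(b)) <= maxDistance;
    }

    /** True if the two tokens are at most maxDistance paragraphs apart (0 = same paragraph). */
    static boolean withinParagraphs(byte[] a, byte[] b, int maxDistance) {
        return Math.abs(paragraphOf(a) - paragraphOf(b)) <= maxDistance;
    }
}
```

In an actual analysis chain the encoded bytes would presumably be set on each token by a custom TokenFilter, and the check would run over the payloads of candidate matches; both of those parts are left out here. Supporting further unit types such as pages or chunks would mean extending the assumed layout with one more int per unit.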
Best,
Carsten

-- 
Institut für Deutsche Sprache | http://www.ids-mannheim.de
Projekt KorAP                 | http://korap.ids-mannheim.de
Tel. +49-(0)621-43740789      | schnober@ids-mannheim.de
Korpusanalyseplattform der nächsten Generation
Next Generation Corpus Analysis Platform