From: Ian Lea
Date: Fri, 4 Mar 2011 10:34:54 +0000
Subject: Re: index enforcing query terms to appear within the same sentence
To: java-user@lucene.apache.org
In-Reply-To: <4D708F60.5090507@lsv.uni-saarland.de>

You can use a multi-valued field, with one value per sentence, if you
play with the position increment gap: make the gap larger than the slop
of your phrase or span queries so that a match cannot cross a sentence
boundary. See e.g.
http://lucene.472066.n3.nabble.com/Problem-searching-in-the-same-sentence-td1501269.html

A Google search for "lucene indexing sentences" or similar finds that, and more.

Different docs can have different fields and different numbers of
fields, so one field per sentence would work, but the position gap
approach is probably better.

--
Ian.

On Fri, Mar 4, 2011 at 7:06 AM, Michael Wiegand wrote:
> Hi,
>
> I would like to create a Lucene index for a collection of text files.
> The index should be built in such a way that, at search time, I can enforce
> that query term A and query term B are contained within the same sentence.
>
> How should I implement the index? Should I have a different field for every
> sentence (making sure that it is not a multi-valued field, because the values
> would get merged, which is exactly what I do not want)?
> Would it be problematic that different documents would then end up having
> different numbers of fields?
>
> Thank you in advance!
>
> Best,
> Michael
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
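
The position gap idea from the reply can be illustrated with a small model. This is only a sketch in plain Python, not Lucene code: the function names, the gap of 100, and the sample sentences are all made up for illustration. Each sentence stands in for one value of a multi-valued field; token positions advance by 1 inside a sentence and jump by the gap between sentences. A proximity check with a slop smaller than the gap (and at least as large as the longest sentence) can then only succeed inside one sentence, which is roughly what a sloppy phrase or span-near query relies on.

```python
from itertools import product

# Hypothetical gap; it must be larger than any slop used at query time.
POSITION_GAP = 100


def index_sentences(sentences, gap=POSITION_GAP):
    """Return a dict mapping each term to the list of positions it occurs at.

    Positions increase by 1 per token inside a sentence and are pushed
    forward by an extra `gap` at every sentence boundary.
    """
    positions = {}
    pos = 0
    for i, sentence in enumerate(sentences):
        if i > 0:
            pos += gap  # open a hole between consecutive sentences
        for token in sentence.lower().split():
            positions.setdefault(token, []).append(pos)
            pos += 1
    return positions


def near(positions, term_a, term_b, slop):
    """True if some occurrence of term_a lies within `slop` positions of term_b."""
    pairs = product(positions.get(term_a, []), positions.get(term_b, []))
    return any(abs(a - b) <= slop for a, b in pairs)


sentences = ["Lucene indexes text quickly", "Solr builds on top of it"]
with_gap = index_sentences(sentences)
without_gap = index_sentences(sentences, gap=0)

print(near(with_gap, "lucene", "quickly", slop=10))   # same sentence
print(near(with_gap, "quickly", "solr", slop=10))     # adjacent sentences, gap applied
print(near(without_gap, "quickly", "solr", slop=10))  # adjacent sentences, no gap
```

With the gap in place the first check succeeds and the second fails; with a gap of 0 the cross-sentence pair also matches, which is the merging effect the original question was worried about.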