From: Rene Hackl-Sommer
To: java-user@lucene.apache.org
Date: Mon, 15 Mar 2010 22:09:37 +0100
Subject: Re: Increase number of available positions?
Message-ID: <4B9EA211.4060206@gmx.de>
In-Reply-To: <359a92831003150921g3db5cadbo803d36c658b11b81@mail.gmail.com>

Hi Erick,

> What about indexing
> the triplets with a small increment gap between? That is:
> ...
> gets indexed as:
>
> level1-1/level2-1/level3-1 +gap 100
> level1-1/level2-1/level3-2 +gap 100
> level1-1/level2-2/level3-3 +gap 100
> level1-1/level2-2/level3-4

If I understand this correctly, the field would look like

"level1-1/level2-1/level3-1 Term1 Term2 level1-1/level2-1/level3-2 Term3 Term4"?

I think the problem here is the same as with the payloads approach I described in my response to Steve's mail: we cannot test for equality at search time (please correct me if we actually can). So if we have

level1-1/level2-1/level3-1
...
level1-1/level2-1/level3-244
level1-1/level2-2/level3-1
level1-1/level2-2/level3-105

and I search for T1 and T2 on level3, but want them to be in the same level2, this cannot be done satisfactorily.

> Or you could think about *documents* being your level1, that is each
> document has one and only one level1 element but many documents
> may have the same level1 token. Combining this with your increment
> gap notion for level2-3 might work for you.

I was thinking about this, yet the trouble is that the issue at hand is just one field in an already not quite trivial scenario involving 200+ fields.
If I add, say, 50 level1-documents per real document, I would still need to relate these level1-documents to the real documents to which they belong. During retrieval, there are also use cases where I need to look into each of the level1-documents to see whether it fulfills certain criteria and then, in a further step, ascertain whether I can gather the level1-documents needed to fulfill the query on a "MyField" level (which does not exist here per se). I feel this might get somewhat unwieldy.

> You might also search the list for "Heirarchal" or "tree" indexing,
> this is a variant of such I think.

Thank you, I'll look into this.

Cheers
Rene
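
To make the equality problem with the gap approach concrete, here is a minimal sketch in plain Python. It does not use Lucene or any of its APIs. It only simulates token positions with a uniform gap of 100 between level3 groups, as in the quoted suggestion. The function names (index_positions, within, level2_of), the sample terms, and the slop values are assumptions made for illustration.

```python
GAP = 100  # assumed position gap inserted after each level3 group


def index_positions(groups, gap=GAP):
    """Assign consecutive positions to terms, adding `gap` after each group.

    `groups` is a list of (path, terms) pairs. Returns a dict mapping each
    term to a list of (position, path) entries. The path is kept here only
    so the result can be inspected; a positional index would not offer it
    for comparison at search time.
    """
    postings = {}
    pos = 0
    for path, terms in groups:
        for term in terms:
            postings.setdefault(term, []).append((pos, path))
            pos += 1
        pos += gap
    return postings


def within(postings, a, b, slop):
    """Return (path_a, path_b) pairs where a and b are at most `slop` apart."""
    return [
        (path_a, path_b)
        for pos_a, path_a in postings.get(a, [])
        for pos_b, path_b in postings.get(b, [])
        if abs(pos_a - pos_b) <= slop
    ]


def level2_of(path):
    """Strip the level3 component, leaving 'level1-x/level2-y'."""
    return path.rsplit("/", 1)[0]


groups = [
    ("level1-1/level2-1/level3-1", ["T1", "X"]),
    ("level1-1/level2-1/level3-2", ["T2", "Y"]),
    ("level1-1/level2-2/level3-1", ["T1", "Z"]),
]
postings = index_positions(groups)

# A slop below the gap only matches terms inside one level3 group.
same_level3 = within(postings, "T1", "T2", slop=50)

# A slop above the gap reaches neighbouring groups, but the distance is the
# same whether or not a level2 boundary lies in between.
wide = within(postings, "T1", "T2", slop=150)
crossing = [(a, b) for a, b in wide if level2_of(a) != level2_of(b)]

print("same level3:", same_level3)
print("wide window:", wide)
print("crossing a level2 boundary:", crossing)
```

In this simulation both T1 occurrences are equally far from T2, so a proximity window alone cannot separate the pair inside level2-1 from the pair that straddles level2-1 and level2-2. Telling them apart requires comparing the level2 part of the path, which is the equality test that is missing at search time.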