Subject: Re: Adding another dimension to Lucene searches
From: Mark Harwood
Date: Sat, 8 May 2010 11:10:45 +0100
To: dev@lucene.apache.org

OK, seems like there is some interest. I'll work on packaging the code, unit tests and demos and make them available.

> matching ids ... but I didn't quite catch from the slides how you encode
> the parent-child link... is it just "the next docs are sub-documents
> until the next parent doc"?

Yes - using physical proximity avoids any kind of costly look-ups and allows efficient streaming/skipTo logic to work as usual.

The downside is the need to keep each sequence of related docs in the same segment - something Lucene currently doesn't make easy, given its limited control over when segments are flushed. I suspect we'll need some discussion on how best to support this.

Another dependency is that Lucene preserves the sequencing of documents when merging segments together. I think we can rely on this currently (please correct me if I'm wrong), but I would like to formalise it with a JUnit test or some other form of commitment that guarantees this state of affairs.
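To make the physical-proximity idea concrete, here is a rough, Lucene-free sketch in plain Java. It is not the code from the slides and uses no Lucene API. It assumes the layout quoted above (a parent doc followed by its sub-documents until the next parent), models the parent markers as a java.util.BitSet, and models the child query's hits as a sorted int array. The names NestedDocSketch, parentsWithMatchingChild and skipTo are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

public class NestedDocSketch {

    /**
     * Returns the parent doc IDs that have at least one matching child.
     * Assumption: bit i of 'parents' is set when doc i is a parent, and
     * 'childHits' holds sorted IDs of child docs only.
     */
    static List<Integer> parentsWithMatchingChild(BitSet parents, int[] childHits) {
        List<Integer> result = new ArrayList<>();
        int i = 0;
        while (i < childHits.length) {
            int child = childHits[i];
            // Parent-first layout: the owning parent is the nearest
            // parent marker at or before this child. No look-up table needed.
            int parent = parents.previousSetBit(child);
            if (parent >= 0) {
                result.add(parent);
            }
            // One hit is enough for this parent, so jump past the rest
            // of its block instead of visiting every matching child.
            int nextParent = parents.nextSetBit(child + 1);
            if (nextParent < 0) {
                break; // this was the last block
            }
            i = skipTo(childHits, i + 1, nextParent);
        }
        return result;
    }

    /** Index of the first hit >= target, searching from 'from' onwards. */
    static int skipTo(int[] hits, int from, int target) {
        int idx = Arrays.binarySearch(hits, from, hits.length, target);
        return idx >= 0 ? idx : -idx - 1;
    }

    public static void main(String[] args) {
        BitSet parents = new BitSet();
        parents.set(0); // docs 1..3 are its children
        parents.set(4); // doc 5 is its child
        parents.set(6); // doc 7 is its child
        int[] childHits = {1, 2, 3, 7};
        System.out.println(parentsWithMatchingChild(parents, childHits));
    }
}
```

The sketch only works if a parent and its children occupy consecutive doc IDs, which is why the segment-flush and merge-order concerns above matter: if a block were split or reordered, the nearest preceding parent marker would no longer be the right parent.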
Cheers
Mark

On 8 May 2010, at 08:32, Andrzej Bialecki wrote:

> On 2010-05-07 18:25, mark harwood wrote:
>> I have been working on a hierarchical search capability for a while now and wanted to see if there was general interest in adopting some of the thinking into Lucene.
>>
>> The idea needs a little explanation so I've put some slides up here to kick things off:
>>
>> http://www.slideshare.net/MarkHarwood/proposal-for-nested-document-support-in-lucene
>
> Very cool stuff. If I understand the design correctly, the cost of the
> query is roughly the same as constructing a Filter Query from the parent
> query, and then executing the child query with this filter. You probably
> use childScorer.skipTo(nextParentId) to avoid actually traversing all
> matching ids ... but I didn't quite catch from the slides how you encode
> the parent-child link... is it just "the next docs are sub-documents
> until the next parent doc"? or is it a field in the children that points
> to a unique id field of the parent?
>
> --
> Best regards,
> Andrzej Bialecki <><
> Information Retrieval, Semantic Web
> Embedded Unix, System Integration
> http://www.sigram.com Contact: info at sigram dot com

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org