lucene-dev mailing list archives

From Mark Harwood <markharw...@yahoo.co.uk>
Subject Re: Adding another dimension to Lucene searches
Date Sat, 08 May 2010 10:10:45 GMT
OK, seems like there is some interest.
I'll work on packaging the code/unit tests/demos and make it available.


> matching ids ... but I didn't quite catch from the slides how you encode
> the parent-child link... is it just "the next docs are sub-documents
> until the next parent doc"? 

Yes - using physical proximity avoids any kind of costly look-ups and allows efficient streaming/skipTo
logic to work as per usual.
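
To illustrate the proximity idea, here is a minimal sketch in plain Java rather than the code from the slides. It assumes a parent doc is immediately followed by its children within one segment, that `parentMarkers` is a bit set flagging which doc IDs are parents, and that child matches arrive in ascending doc ID order. The class and method names are hypothetical, and the inner loop stands in for a real `skipTo` call on a scorer.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

final class NestedDocSketch {

    /**
     * Returns the doc IDs of parents that have at least one matching child.
     * parentMarkers: bit i is set when doc i is a parent (assumed layout:
     * parent first, then its children, up to the next parent).
     * sortedChildMatches: matching child doc IDs in ascending order.
     */
    static List<Integer> parentsOfMatches(BitSet parentMarkers, int[] sortedChildMatches) {
        List<Integer> parents = new ArrayList<>();
        int i = 0;
        while (i < sortedChildMatches.length) {
            int child = sortedChildMatches[i];

            // The owning parent is the nearest marked doc at or before the child.
            int parent = parentMarkers.previousSetBit(child);
            if (parent < 0) {
                i++; // no parent precedes this doc; ignore it
                continue;
            }
            parents.add(parent);

            // Once a parent is known to match, its remaining children need
            // not be visited: jump straight to the next parent's block.
            int nextParent = parentMarkers.nextSetBit(child + 1);
            if (nextParent < 0) {
                break; // every remaining match belongs to this same parent
            }
            while (i < sortedChildMatches.length && sortedChildMatches[i] < nextParent) {
                i++; // stand-in for childScorer.skipTo(nextParent)
            }
        }
        return parents;
    }
}
```

No per-document lookup of a parent id field is involved: the link is recovered purely from doc ID order, which is why the layout depends on related docs staying contiguous.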

The downside is the need to maintain sequences of related docs in the same segment - something
Lucene currently doesn't make easy with its limited control over when segments are flushed.
I suspect we'll need some discussion on how best to support this.

Another dependency is that Lucene preserves the order of documents when merging segments
together - this is something I think we can rely on currently (please correct me if I'm wrong)
but I would like to formalise this with a JUnit test or some other form of commitment which
guarantees this state of affairs.

Cheers
Mark


On 8 May 2010, at 08:32, Andrzej Bialecki wrote:

> On 2010-05-07 18:25, mark harwood wrote:
>> I have been working on a hierarchical search capability for a while now and wanted
>> to see if there was general interest in adopting some of the thinking into Lucene.
>> 
>> The idea needs a little explanation so I've put some slides up here to kick things
>> off:
>> 
>> http://www.slideshare.net/MarkHarwood/proposal-for-nested-document-support-in-lucene
> 
> Very cool stuff. If I understand the design correctly, the cost of the
> query is roughly the same as constructing a Filter Query from the parent
> query, and then executing the child query with this filter. You probably
> use childScorer.skipTo(nextParentId) to avoid actually traversing all
> matching ids ... but I didn't quite catch from the slides how you encode
> the parent-child link... is it just "the next docs are sub-documents
> until the next parent doc"? or is it a field in the children that points
> to a unique id field of the parent?
> 
> 
> 
> -- 
> Best regards,
> Andrzej Bialecki     <><
> ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
> 



