lucene-dev mailing list archives

From "Mark Harwood (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-2454) Nested Document query support
Date Tue, 21 Jun 2011 08:54:47 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052436#comment-13052436 ]

Mark Harwood commented on LUCENE-2454:
--------------------------------------

bq. This overlaps with the BlockJoinQuery of LUCENE-3171, this issue might even be closed
as duplicate of that one. Which one is preferred?

We need to look at the likely use cases. 2454 was created to service a use case that I expect
to be a very common pattern, and I'm not sure LUCENE-3171 satisfies this need. Apps commonly
need to return a selection of both matching and non-matching children along with the "best"
parents. Why? The rationale is very similar to the way highlighting returns a summary
of text: it doesn't just return the matched words, it also returns surrounding text as useful
context when displaying results to users. However, some texts can be very large, so there's
a need to limit how much context is brought back.
If we apply this logic to 2454, we can see that for the top parents it is common to also want
some non-matching children (e.g. for a resume, return a person's employment history, not just
the employments that matched the original search), but it is also necessary to summarise some
parents' histories (e.g. the contractor who listed a gazillion positions in his employment
history needs summarising). A common pattern is for solutions to ask for the best 11 children
for each of the best parents and display only 10. That way the app knows that for certain
parents there is more data available (i.e. those that returned 11 children) and can offer a
"more" button to retrieve the extra children for parents of interest. 2454 satisfies this use
case as follows:
# Use a NestedDocumentQuery to get the best parents, with the child criteria expressed as a "MUST" clause
# Use a PerParentLimitedQuery to get a selection of children per top parent, where each child MUST
belong to a top parent (tested using the parent's primary key); apply the child criteria again, but
this time as a "SHOULD" clause, to relevance-rank the selection of children returned
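
The "ask for 11, display 10" pattern described above can be sketched in plain Java, independent of any Lucene API. This is only an in-memory illustration under stated assumptions: the class and method names (PerParentLimitSketch, Child, Page, selectChildren) are hypothetical and are not taken from the LUCENE-2454 patch, and a score of 0 stands in for a child that did not match the "SHOULD" criteria.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory sketch; not the API of the LUCENE-2454 patch.
final class PerParentLimitSketch {

    /** A child document (e.g. one employment); score 0 means it did not match. */
    static final class Child {
        final String text;
        final double score;
        Child(String text, double score) { this.text = text; this.score = score; }
    }

    /** The children chosen for display for one parent, plus a flag for a "more" button. */
    static final class Page {
        final List<Child> shown;
        final boolean hasMore;
        Page(List<Child> shown, boolean hasMore) { this.shown = shown; this.hasMore = hasMore; }
    }

    /** Rank one parent's children, fetch displayLimit + 1 of them, and show only displayLimit. */
    static Page selectChildren(List<Child> children, int displayLimit) {
        List<Child> ranked = new ArrayList<>(children);
        // Highest score first, so matching children precede non-matching ones.
        // List.sort is stable, so ties keep their original order.
        ranked.sort(Comparator.comparingDouble((Child c) -> c.score).reversed());

        // Fetch one more child than will be displayed.
        List<Child> fetched = ranked.subList(0, Math.min(displayLimit + 1, ranked.size()));

        // The extra child is never shown; its presence only signals that more data exists.
        boolean hasMore = fetched.size() > displayLimit;
        List<Child> shown =
            new ArrayList<>(fetched.subList(0, Math.min(displayLimit, fetched.size())));
        return new Page(shown, hasMore);
    }

    public static void main(String[] args) {
        List<Child> history = Arrays.asList(
            new Child("Java developer", 2.0),
            new Child("Barista", 0.0),
            new Child("Search engineer", 3.5),
            new Child("Tester", 0.0));

        Page page = selectChildren(history, 3);
        for (Child c : page.shown) {
            System.out.println(c.text);
        }
        System.out.println("more available: " + page.hasMore);
    }
}
```

In a real index the ranking and per-parent limiting would presumably happen inside the query itself rather than in application code; the sketch only shows why fetching one child beyond the display limit is enough to decide whether to offer a "more" button.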

It's worth considering this sort of use case carefully before making any code decisions.



> Nested Document query support
> -----------------------------
>
>                 Key: LUCENE-2454
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2454
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: core/search
>    Affects Versions: 3.0.2
>            Reporter: Mark Harwood
>            Assignee: Mark Harwood
>            Priority: Minor
>         Attachments: LUCENE-2454.patch, LUCENE-2454.patch, LuceneNestedDocumentSupport.zip
>
>
> A facility for querying nested documents in a Lucene index as outlined in http://www.slideshare.net/MarkHarwood/proposal-for-nested-document-support-in-lucene

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

