lucene-dev mailing list archives

From "David Smiley (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-12298) Index Full nested document Hierarchy For Queries (umbrella issue)
Date Tue, 08 May 2018 15:00:00 GMT

    [ https://issues.apache.org/jira/browse/SOLR-12298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467523#comment-16467523 ]

David Smiley commented on SOLR-12298:
-------------------------------------

I'm increasingly in favor of getting rid of SolrInputDocument's _childDocuments list.
I think it would simplify special-casing logic in some places, and it would add more
semantic information on the relationships.  There aren't _that_ many non-test callers of
getChildDocuments() + addChildDocument() + addChildDocuments().  Some of those locations would
melt away if, hypothetically, anonymous children were added under the field key {{_childDocuments_}}.
[~mkhludnev] you work with block joins a lot; what do you think of this refactoring proposal?
To be explicit: I'm proposing removing SolrInputDocument._childDocuments in favor of having
fields contain child document values.
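
To make the proposal concrete, here is a rough sketch of what "fields contain child document
values" could look like.  SketchDoc is a hypothetical stand-in written only for this comment,
not the real SolrInputDocument, and the method names are assumptions; the only name taken from
Solr is the {{_childDocuments_}} key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for SolrInputDocument; NOT the real SolrJ class. */
final class SketchDoc {
    /** Reserved key under which anonymous (unnamed) children would be stored. */
    static final String ANONYMOUS_KEY = "_childDocuments_";

    // Field name -> values. A value is either a plain object or another SketchDoc.
    final Map<String, List<Object>> fields = new LinkedHashMap<>();

    /** Adds a value to a field; the value may itself be a child document. */
    SketchDoc add(String name, Object value) {
        fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        return this;
    }

    /** Anonymous children become ordinary field values instead of a separate list. */
    SketchDoc addAnonymousChild(SketchDoc child) {
        return add(ANONYMOUS_KEY, child);
    }

    /** Finds every direct child by inspecting field values; no dedicated child list exists. */
    List<SketchDoc> children() {
        List<SketchDoc> out = new ArrayList<>();
        for (List<Object> values : fields.values()) {
            for (Object v : values) {
                if (v instanceof SketchDoc) {
                    out.add((SketchDoc) v);
                }
            }
        }
        return out;
    }
}
```

With this shape, a named relationship ("chapters") and an anonymous child are handled by the
same code path, which is where I'd expect some of the special-casing to disappear.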

If this all sounds good, let's create a sub-task about this refactoring.

Just an idea: imagine a new dummy FieldType called "ChildDocument".  In this way the schema
could explicitly capture under which field names child docs exist, and whether each is
single or multiValued.  This needn't come to pass until it's of use.  AddSchemaFieldsUpdateProcessorFactory
and AddUpdateCommand.flatten/recUnwrap need to navigate the child documents without knowing
at what names they will exist.  Perhaps it's reasonably efficient to just iterate all fields
in the document to look, but if the schema declared which fields have child relationships,
then the lookup could skip every other field and would be faster.
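
The two lookup strategies can be sketched as follows.  This is illustrative only: documents are
modeled as plain maps, ChildFieldScan and its methods are hypothetical names, and it is not the
actual AddUpdateCommand.flatten code.  Passing null for childFields means "no schema knowledge,
inspect every field"; passing a set simulates a schema that declares the child fields.  Emitting
children before their parent is an assumption made here to mirror block ordering.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; documents are plain maps, child docs are nested maps. */
final class ChildFieldScan {

    /**
     * Flattens a nested document into a list, children before their parent.
     * If childFields is null, every field is inspected; otherwise only the
     * declared field names are examined for child documents.
     */
    static List<Map<String, Object>> flatten(Map<String, Object> doc, Set<String> childFields) {
        List<Map<String, Object>> out = new ArrayList<>();
        walk(doc, childFields, out);
        return out;
    }

    private static void walk(Map<String, Object> doc, Set<String> childFields,
                             List<Map<String, Object>> out) {
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            // With declared names, skip fields that cannot hold children.
            if (childFields != null && !childFields.contains(e.getKey())) {
                continue;
            }
            Object v = e.getValue();
            if (v instanceof Map) {
                walk(asDoc(v), childFields, out);           // single-valued child
            } else if (v instanceof Collection) {
                for (Object item : (Collection<?>) v) {     // multiValued children
                    if (item instanceof Map) {
                        walk(asDoc(item), childFields, out);
                    }
                }
            }
        }
        out.add(doc); // parent goes after all of its descendants
    }

    @SuppressWarnings("unchecked")
    private static Map<String, Object> asDoc(Object o) {
        return (Map<String, Object>) o;
    }
}
```

Both calls return the same list when the declared names are complete; the declared-names variant
just avoids type-checking the values of fields that can never hold children.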


> Index Full nested document Hierarchy For Queries (umbrella issue)
> -----------------------------------------------------------------
>
>                 Key: SOLR-12298
>                 URL: https://issues.apache.org/jira/browse/SOLR-12298
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: mosh
>            Priority: Major
>
> Solr ought to have the ability to index deeply nested objects, while storing the original
> document hierarchy.
> Currently the client has to index the child document's full path and level to manually
> reconstruct the original document structure, since the children are flattened and returned
> in the reserved "_childDocuments_" key.
> Ideally you could index a nested document, having Solr transparently add the required
> fields while providing a document transformer to rebuild the original document's hierarchy.
>
> This issue is an umbrella issue for the particular tasks that will make it all happen,
> either as subtasks or through issue linking.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

