lucene-dev mailing list archives

From "David Smiley (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-12298) Index Full nested document Hierarchy For Queries (umbrella issue)
Date Mon, 07 May 2018 21:52:00 GMT

    [ https://issues.apache.org/jira/browse/SOLR-12298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16466517#comment-16466517 ]

David Smiley commented on SOLR-12298:
-------------------------------------

bq.  that will not prevent the need for a new JSON loader, since the current one requires
the \_childDocument\_ to be added at each level.

I'm not sure I understand you.  Firstly, \_childDocument\_ isn't required; you could instead
provide an array as a JSON value and it'll be assumed to contain multiple child documents.
JsonLoaderTest.PARENT_TWO_CHILDREN_JSON demonstrates that.  Secondly, I don't see why this
is a problem or what consequence it has.

Nevertheless, I can see that SolrInputDocument.getChildDocuments doesn't capture the nature
of the relationship, and different child docs may relate to the parent in different ways.
The JSON structure will usually have labels that are indicative of that relationship, like a
"comments" array of child docs on a blog post, and perhaps an "author" child doc for the
author of the post (if we imagine modeling it this way).

Still, it'd be a shame if a solution here were tied to JSON only, so I'm stubborn about going
with a URP (UpdateRequestProcessor) for at least part of this.  Perhaps if the SolrInputDocument
held optional contextual metadata populated by JsonLoader, then a URP could use this information.
Lacking that information, it could work in a general way (e.g. simply assume a "child"
relationship).  Or... what if SolrInputDocument did not have an explicit \_childDocuments field
list?  What if a SolrInputDocument was simply a supported value inside SolrInputField?  That
would be a bigger change but may be a more appropriate fix, since adding the relationship after
the fact (what we're talking about) could be seen as a hack on top of SolrInputDocument, which
doesn't capture it natively when it should.  I'll sleep on it.
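
To make the "document as a field value" idea concrete, here is a rough sketch.  It is only
an illustration: Doc is a hypothetical stand-in, not the real SolrInputDocument, and the
field names and ids are made up for the blog post example.  It shows that when a child doc
is stored under a named field, the field name itself carries the relationship, so something
like a URP could read it without format-specific metadata.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NestedDocSketch {

    /** Hypothetical stand-in for SolrInputDocument; NOT the real SolrJ class. */
    static final class Doc {
        // Insertion-ordered so fields are visited in the order they were set.
        final Map<String, Object> fields = new LinkedHashMap<>();

        Doc set(String name, Object value) {
            fields.put(name, value);
            return this;
        }
    }

    /** Builds the blog post example: one "author" child and a "comments" array. */
    static Doc blogPost() {
        Doc author = new Doc().set("id", "a1").set("name", "Jane");
        Doc c1 = new Doc().set("id", "c1").set("text", "First!");
        Doc c2 = new Doc().set("id", "c2").set("text", "Nice post");
        return new Doc()
                .set("id", "p1")
                .set("title", "Hello")
                .set("author", author)              // single child doc as a field value
                .set("comments", List.of(c1, c2));  // list of child docs as a field value
    }

    /** Returns "relationship:childId" for each direct child doc found in a field value. */
    static List<String> relationships(Doc parent) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Object> e : parent.fields.entrySet()) {
            Object value = e.getValue();
            if (value instanceof Doc) {
                out.add(e.getKey() + ":" + ((Doc) value).fields.get("id"));
            } else if (value instanceof List) {
                for (Object item : (List<?>) value) {
                    if (item instanceof Doc) {
                        out.add(e.getKey() + ":" + ((Doc) item).fields.get("id"));
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [author:a1, comments:c1, comments:c2]
        System.out.println(relationships(blogPost()));
    }
}
```

With a flat child list, by contrast, the same three children would come back without the
"author" / "comments" labels, which is the information loss described above.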

> Index Full nested document Hierarchy For Queries (umbrella issue)
> -----------------------------------------------------------------
>
>                 Key: SOLR-12298
>                 URL: https://issues.apache.org/jira/browse/SOLR-12298
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: mosh
>            Priority: Major
>
> Solr ought to have the ability to index deeply nested objects, while storing the original
> document hierarchy.
>  Currently the client has to index the child document's full path and level to manually
> reconstruct the original document structure, since the children are flattened and returned
> in the reserved "__childDocuments__" key.
> Ideally you could index a nested document, having Solr transparently add the required
> fields while providing a document transformer to rebuild the original document's hierarchy.
>  
> This issue is an umbrella issue for the particular tasks that will make it all happen
> – either subtasks or issue linking.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

