lucene-dev mailing list archives

From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-12298) Index Full nested document Hierarchy For Queries (umbrella issue)
Date Wed, 09 May 2018 21:53:00 GMT

    [ https://issues.apache.org/jira/browse/SOLR-12298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469560#comment-16469560 ]

Hoss Man commented on SOLR-12298:
---------------------------------

bq. Quoting Hoss Man here inline (hoping for his input):

I don't have a lot of input / opinions on this general topic of childDocs at the moment...

From skimming the issue description & last few comments I gather the push here is
to make the "arbitrary nested documents of different types" experience/API for *external clients*
simpler/easier/cleaner/sexier ... and to then have rules/conventions enforced by Solr (either
via URPs or the underlying DUH, ... not certain which exactly is being suggested at the moment)
handle the mapping of those "external relationships" into the *internal* nested childDocs
w/new fields based on the original hierarchy.

Ie: an external client could send documents that look like (in pseudo-code) {{parentDoc[ ...normalfields...,
someFieldName => childDoc1[...], someOtherFieldName => childDoc2[...] ]}} and then something
in solr would translate that into the _internal_ representation of nested documents by moving
that relationship info into fields of the child documents ala... {{parentDoc[ ...normalfields...,
\_childDocuments\_=>[ childDoc1[ ..., typeField => someFieldName], childDoc2[ ..., typeField
=> someOtherFieldName] ] ]}} .  (And i guess, also add some other standard metadata fields
to every doc like what the types of the ancestors are?)
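
To make that translation concrete, here is a minimal sketch in Python of what such a mapping could look like. It is not Solr code: the function name {{to_internal}} is hypothetical, {{typeField}} is just the placeholder name from the pseudo-code above, and the rule "every map-valued field is a child document" is an assumption made for illustration only.

```python
from typing import Any, Dict

CHILD_KEY = "_childDocuments_"  # Solr's reserved key for nested children
TYPE_FIELD = "typeField"        # placeholder name from the pseudo-code, not a real Solr field


def to_internal(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: move map-valued fields into _childDocuments_.

    Each child records the name of the field it came from in TYPE_FIELD,
    so the original relationship can be recovered later.
    """
    flat: Dict[str, Any] = {}
    children = []
    for name, value in doc.items():
        candidates = value if isinstance(value, list) else [value]
        # Assumption: a map (or a non-empty list of maps) is a child document.
        if candidates and all(isinstance(v, dict) for v in candidates):
            for child in candidates:
                internal = to_internal(child)  # recurse for deeper nesting
                internal[TYPE_FIELD] = name
                children.append(internal)
        else:
            flat[name] = value
    if children:
        flat[CHILD_KEY] = children
    return flat
```

Calling {{to_internal}} on a parent with one map-valued field and one list-of-maps field yields a parent whose normal fields are untouched and whose children all sit under {{\_childDocuments\_}}, each tagged with the field name it was nested under.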

Or to put it another way: give solr the power to do _internally_ what clients currently have
to do _externally_ to model this information.

is that about right?

This approach seems fine in general ... off the top of my head the biggest concern i can think
of is how you make something like the JSON ContentLoader smart enough to tell the difference
between a "child document expressed as JSON object/map (in a field)" and an "atomic update (of
a field) as a JSON object/map" (not an issue with the XML ContentLoader since the {{<doc/>}}
tag is distinct from the {{<lst/>}} tag).
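
A small Python sketch of why this is ambiguous in JSON. The heuristic below ({{looks_like_atomic_update}}) is hypothetical and is not how Solr's loader decides anything; the modifier names are a subset of Solr's atomic update operations, and the exact set depends on the Solr version.

```python
from typing import Any

# Subset of Solr's atomic-update modifiers (assumption: the full list varies by version).
ATOMIC_OPS = {"set", "add", "remove", "removeregex", "inc"}


def looks_like_atomic_update(value: Any) -> bool:
    """Hypothetical heuristic: a non-empty map whose keys are all modifiers."""
    return isinstance(value, dict) and bool(value) and set(value) <= ATOMIC_OPS


# An atomic update of the "price" field.
atomic = {"id": "1", "price": {"set": 99}}

# A child document nested under the "price" field.
child = {"id": "1", "price": {"id": "1.1", "amount": 99}}

# A child document that happens to have a field named "set":
# structurally identical to an atomic update, so the heuristic guesses wrong.
tricky_child = {"id": "1", "match": {"set": 3}}

for doc in (atomic, child, tricky_child):
    field = next(k for k in doc if k != "id")
    print(field, looks_like_atomic_update(doc[field]))
```

The third document is the problem case: nothing in the JSON itself says whether {{"set"}} is a modifier or a field name, so any key-based guess needs an extra rule or convention to break the tie.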

(FWIW: i don't particularly remember making those ~6 year old comments/questions from SOLR-3535
cited here, but i'm guessing my concerns at the time were just that these questions all needed
to be answered in order to take that leap, and that until/unless we had those answers it seemed
simplest to move forward with exposing a "lower level" modeling of child documents so users
could take advantage of it ... if we're ready to answer those questions to support a cleaner/simpler
API then by all means let's support it)

> Index Full nested document Hierarchy For Queries (umbrella issue)
> -----------------------------------------------------------------
>
>                 Key: SOLR-12298
>                 URL: https://issues.apache.org/jira/browse/SOLR-12298
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: mosh
>            Priority: Major
>
> Solr ought to have the ability to index deeply nested objects, while storing the original
> document hierarchy.
>  Currently the client has to index the child document's full path and level to manually
> reconstruct the original document structure, since the children are flattened and returned
> in the reserved "__childDocuments__" key.
> Ideally you could index a nested document, having Solr transparently add the required
> fields while providing a document transformer to rebuild the original document's hierarchy.
>  
> This issue is an umbrella issue for the particular tasks that will make it all happen
> – either subtasks or issue linking.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org