lucene-dev mailing list archives

From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-3535) Add block support for XMLLoader
Date Wed, 13 Jun 2012 17:15:43 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294569#comment-13294569 ]

Hoss Man commented on SOLR-3535:
--------------------------------

bq. The necessity to treat multiple docs as a single update introduces complexity into the
update processor chain regardless.

Exactly.  No matter how we deal with this sort of thing in the external (xml/json/etc) APIs,
or in the internal (SolrInputDocument) APIs, the UpdateRequestProcessors are going to need
to be changed to explicitly understand the relationships of these docs -- so let's model things
in the way that makes the most sense and work from there -- with the added bonus that modeling
things the way that makes the most sense should also be the easiest way to make it work with
SolrCloud.

My suggestion for an order of iterative implementation:

1) add "List<SolrInputDocument> getChildDocuments()" to SolrInputDocument
2) make RunUpdateProcessor do the right thing with child docs
3) make the JavaBinCodec aware of getChildDocuments() so solrj can serialize/deserialize (which
should mean SolrCloud can propagate them transparently)
4) get basic tests of hierarchical doc updates/deletes working in both standalone and solrcloud
mode
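
To make step 1 concrete, here is a minimal sketch of what a hierarchical input document could look like. Everything in it is an assumption for illustration: SketchInputDocument is a hypothetical stand-in, not the real SolrInputDocument, and addChildDocument() and flattenToBlock() are invented names. The children-before-parent ordering reflects the Lucene block-join convention of indexing the parent as the last document of a block; whether RunUpdateProcessor would flatten this way is not settled here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for SolrInputDocument; NOT the real Solr class. */
class SketchInputDocument {
    private final Map<String, Object> fields = new LinkedHashMap<>();
    private final List<SketchInputDocument> children = new ArrayList<>();

    /** Sets a single-valued field; returns this so calls can be chained. */
    SketchInputDocument setField(String name, Object value) {
        fields.put(name, value);
        return this;
    }

    Object getFieldValue(String name) {
        return fields.get(name);
    }

    /** Assumed helper: attaches a nested document to this one. */
    SketchInputDocument addChildDocument(SketchInputDocument child) {
        children.add(child);
        return this;
    }

    /** The accessor proposed in step 1, returned as a read-only view. */
    List<SketchInputDocument> getChildDocuments() {
        return Collections.unmodifiableList(children);
    }

    /**
     * Assumed helper: flattens the hierarchy depth-first into one block,
     * with every document's children placed before the document itself.
     */
    List<SketchInputDocument> flattenToBlock() {
        List<SketchInputDocument> block = new ArrayList<>();
        collect(this, block);
        return block;
    }

    private static void collect(SketchInputDocument doc, List<SketchInputDocument> out) {
        for (SketchInputDocument child : doc.children) {
            collect(child, out);
        }
        out.add(doc);
    }
}
```

With a model like this, a loader only has to build the tree; anything downstream (update processors, the codec, the final index write) can walk getChildDocuments() instead of guessing relationships from a flat list.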

Then lots of other stuff can be done in parallel, with none of it gating the rest...

* syntax in various loaders
** XML
** json
** DIH entities
* change simple update processors to know about nested docs (ie: field mutators)
* add new options/processors for more complex update processor use cases (ie: we'll probably
want SignatureUpdateProcessor to be able to do something with the nested docs, etc...)
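
As a rough illustration of the "simple update processors" bullet, a field mutator that knows about nested docs mostly just needs to recurse. The sketch below is not Solr code: NestedDoc and TrimFieldsSketch are hypothetical names, and a real processor would work against SolrInputDocument inside the UpdateRequestProcessor chain.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical nested document: a field map plus child documents. */
class NestedDoc {
    final Map<String, Object> fields = new LinkedHashMap<>();
    final List<NestedDoc> children = new ArrayList<>();
}

/** Hypothetical field mutator that trims string values at every level. */
class TrimFieldsSketch {
    static void trimRecursively(NestedDoc doc) {
        // Mutate this document's own fields; non-string values are left alone.
        for (Map.Entry<String, Object> entry : doc.fields.entrySet()) {
            if (entry.getValue() instanceof String) {
                entry.setValue(((String) entry.getValue()).trim());
            }
        }
        // Apply the same mutation to every nested document.
        for (NestedDoc child : doc.children) {
            trimRecursively(child);
        }
    }
}
```

Processors such as signature generation are harder because they have to decide whether a child contributes to the parent's signature, which is why they sit in the "more complex" bullet.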

...but the bottom line is all of that stuff -- even the XML syntax -- is really secondary
to understanding the right way to deal with it in the internal APIs, and in my opinion that's
modeling as a true hierarchy in the SolrInputDocument class.

                
> Add block support for XMLLoader
> -------------------------------
>
>                 Key: SOLR-3535
>                 URL: https://issues.apache.org/jira/browse/SOLR-3535
>             Project: Solr
>          Issue Type: Sub-task
>          Components: update
>    Affects Versions: 4.1, 5.0
>            Reporter: Mikhail Khludnev
>            Priority: Minor
>         Attachments: SOLR-3535.patch
>
>
> I'd like to add the following update xml message:
> <add-block>
>     <doc>....</doc>
>     <doc>....</doc>
> </add-block>
> out of scope for now: 
> * other update formats
> * update log support (NRT), should not be a big deal
> * overwrite feature support for block updates - it's more complicated, I'll tell you why
> Alt
> * wdyt about adding attribute to the current tag {pre}<add block="true">{pre} 
> * or we can establish RunBlockUpdateProcessor which treats every <add> ....</add> as a block.
> *Test is included!!*
> How would you suggest improving the patch?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

