lucene-dev mailing list archives

From Alexandre Rafalovitch <arafa...@gmail.com>
Subject Re: Request for sanity check: SOLR-9477 (UpdateRequestProcessors ignore child documents)
Date Sat, 10 Sep 2016 13:14:17 GMT
Only AddSchemaFieldsUpdateProcessorFactory deals with the children.
None of the others do.

So, as a first group, I am looking at the DefaultValue URPs, of which
we have 3:
* DefaultValueUpdateProcessorFactory
* TimestampUpdateProcessorFactory
* UUIDUpdateProcessorFactory

Of those, UUIDUpdateProcessorFactory - to me - should always apply on
all levels. Mikhail said we need a paradigm shift on the whole ID
thing, but while we wait for that, we need to assign those unique IDs
or see the import fail.
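
To make the "all levels" behaviour concrete, here is a minimal sketch of
the recursion it implies. It is not Solr code: the Doc class is a
hypothetical stand-in for a nested input document, and assignMissingIds
is an illustrative name, not an existing URP method.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

public class UuidAllLevels {
    // Hypothetical stand-in for a nested input document (assumption, not Solr's class).
    static class Doc {
        final Map<String, Object> fields = new LinkedHashMap<>();
        final List<Doc> children = new ArrayList<>();
    }

    // Give every document in the block an id if it lacks one, at any depth.
    // Existing ids are left untouched.
    static void assignMissingIds(Doc doc, String idField) {
        doc.fields.computeIfAbsent(idField, k -> UUID.randomUUID().toString());
        for (Doc child : doc.children) {
            assignMissingIds(child, idField);
        }
    }

    public static void main(String[] args) {
        Doc root = new Doc();
        root.fields.put("id", "book-1"); // the parent already has an id
        Doc child = new Doc();           // the child and grandchild do not
        Doc grandchild = new Doc();
        child.children.add(grandchild);
        root.children.add(child);

        assignMissingIds(root, "id");

        System.out.println(root.fields.get("id"));
        System.out.println(child.fields.get("id"));
        System.out.println(grandchild.fields.get("id"));
    }
}
```

The point of the sketch is only that the walk has to descend through
every level of children, not just the first one.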

On the other hand, both Timestamp and DefaultValue URPs are more
complex as they both actually create a field. With timestamp, it seems
that applying it universally would mean it would be on all nested
documents, however many levels down. Seems redundant.

Default value URP is more complex again - the 'price' element given in
the example could be in a parent or in a child. Or in a grandchild.
And may not make sense on the other levels. What's the best way to
deal with that?
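
One possible shape for an answer, following David's suggestion quoted
below of choosing the scope at the URP declaration, is sketched here.
Everything in it is an assumption for illustration: the Scope enum, the
Doc class and applyDefault are hypothetical names, and nothing in this
thread says Solr has such an option.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ScopedDefaultValue {
    // Hypothetical setting: which levels of a block the default applies to.
    enum Scope { ROOT, CHILDREN, ALL }

    // Hypothetical stand-in for a nested input document (assumption, not Solr's class).
    static class Doc {
        final Map<String, Object> fields = new LinkedHashMap<>();
        final List<Doc> children = new ArrayList<>();
    }

    // Set field to value where it is missing, restricted to the chosen scope.
    static void applyDefault(Doc root, String field, Object value, Scope scope) {
        apply(root, field, value, scope, true);
    }

    private static void apply(Doc doc, String field, Object value,
                              Scope scope, boolean isRoot) {
        boolean inScope = scope == Scope.ALL
                || (isRoot ? scope == Scope.ROOT : scope == Scope.CHILDREN);
        if (inScope) {
            doc.fields.putIfAbsent(field, value);
        }
        // Always descend, so that CHILDREN and ALL reach grandchildren too.
        for (Doc child : doc.children) {
            apply(child, field, value, scope, false);
        }
    }

    public static void main(String[] args) {
        Doc root = new Doc();
        Doc child = new Doc();
        root.children.add(child);

        applyDefault(root, "price", 0.0, Scope.CHILDREN);

        System.out.println(root.fields.containsKey("price"));
        System.out.println(child.fields.containsKey("price"));
    }
}
```

A three-way scope still cannot say "grandchildren only", so the 'price'
case may need something finer, such as a depth or a per-document
condition.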

Obviously, this discussion applies to all of the URPs; I am just using
a specific example here to make the discussion more concrete.

Regards,
   Alex.


----
Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 7 September 2016 at 22:01, David Smiley <david.w.smiley@gmail.com> wrote:
> I think we need to document which URPs apply to child docs (if any) and
> otherwise say that, unless otherwise noted, URPs only apply to the root doc.
>
> It would be nice if one could simply configure at the URP declaration if it
> applies to children only, root doc only, or both.  I figure *most* URPs
> don't need internal coding to pick.  If users could pick at the declaration
> to which it applies, we might then also want the ability for some URPs to
> declare that they only work with root docs and then Solr could give you a
> helpful error if you misuse it.
>
> ~ David
>
> On Wed, Sep 7, 2016 at 4:38 AM Alexandre Rafalovitch <arafalov@gmail.com>
> wrote:
>>
>> On 7 September 2016 at 11:52, Mikhail Khludnev <mkhl@apache.org> wrote:
>> >> *) Do we need to document this as a limitation?
>> >
>> > Yes. Sure. But it's nowhere promised that update processors are applied
>> > on children.
>> I feel this violates the principle of least surprise. For example, I
>> discovered this issue by creating a sample dataset that ended up
>> with a blank Date field in the child record. So, I figured this would be
>> easy to fix on the Solr side by just adding the RemoveBlankField URP. And...
>> it did not work. Took me an hour to figure out that _none_ of the URPs
>> work on child documents at all. More importantly, it means we have no
>> solutions for the child documents for all those problems we solved over
>> time with URPs. Which we need to document and/or fix.
>>
>> >> *) Do we need to fix it individually or there is a smart way to do it
>> >> centrally?
>>
>> > Probably yes, but I prefer to aim at someone's real-life challenge rather
>> > than solve an abstract common-sense problem, e.g. how to apply a processor
>> > to children only but not to the parent. Another case is ID generation: it
>> > is often required to generate it for the parent and then implicitly
>> > propagate it to the children, but that requires a paradigm shift:
>> > uniqueKey should be assigned on a block, not a single document. This will
>> > fix the case when childless docs are mixed with blocks, etc.
>> > Perhaps it makes sense to create a processor which applies a certain
>> > chain of processors to child docs only?
>>
>> Ok, these are interesting questions that can apply to specific URPs.
>> This suggests to me that each one (or each group) should be dealt with
>> separately.
>>
>> > Frankly speaking, block support is still raw, and this limitation is not
>> > considered significant in comparison with the other ones.
>>
>> Well, we support them in XML, JSON, DIH, queries, etc. It may be raw
>> but it is usable and people are using it. So - to me - it is a valid
>> issue and worth thinking about.
>>
>> Regards,
>>     Alex.
>> ----
>> Newsletter and resources for Solr beginners and intermediates:
>> http://www.solr-start.com/
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com


