lucene-dev mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: revisit naming for grouping/join?
Date Wed, 06 Jul 2011 12:36:02 GMT
On Tue, Jul 5, 2011 at 4:58 PM, Chris Hostetter
<hossman_lucene@fucit.org> wrote:

> : > if so, then independent of the Collector, "ParentDocumentQuery" (or
> : > ParentDocumentQueryWrapper) still seems like it makes the most sense.
> :
> : Hmm, but that doesn't convey that it handles this nesting, ie, that
> : it's joining child docs with parent docs.
>
> you lost me there ... i feel like i'm not understanding your use of
> "joining"
>
> If...
>  * x is the parent of x1, x2, etc...
>  * y is the parent of y1, y2, etc...
>  * queryA matches w5, x1, x2, and z3
>  * queryB = BlockJoinQuery(queryA)
>
> ...then doesn't queryB only match w, x, and z?
>
> Isn't it just a query that wraps another query and returns the parents of
> the docs matched by the wrapped query?

Yes, it is... though, I think "normally" one would do a new
BlockJoinQuery per table (ie one for x, one for y, one for z); this
way you can pull child hits per-table.  But I believe your way (doing
3 joins w/ one BlockJoinQuery) will work fine (ie give the right
parent hits).

And this is the same thing as "joining" (relating the child matches
"up" to the corresponding parents), ie, this query "finishes" the join
done during indexing.
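
To make the w/x/y/z example concrete, here is a minimal plain-Java sketch
of that "finish the join" step. It does not use the Lucene API. It assumes
each block was indexed with the children first and the parent as the last
doc of the block, and that parent docs are marked in a bit set; the doc ID
layout and the names BlockJoinSketch and parentsOf are made up for
illustration.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class BlockJoinSketch {
    /**
     * Maps each child hit "up" to its parent: the first parent bit at or
     * after the child's doc ID. Duplicate parents are collapsed.
     */
    public static int[] parentsOf(int[] childHits, BitSet parentBits) {
        Set<Integer> parents = new LinkedHashSet<>();
        for (int child : childHits) {
            int parent = parentBits.nextSetBit(child);
            if (parent >= 0) {
                parents.add(parent);
            }
        }
        return parents.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        // Assumed doc ID layout (children first, parent last in each block):
        // w1..w5 = 0..4, w = 5 | x1 = 6, x2 = 7, x = 8
        // y1 = 9, y2 = 10, y = 11 | z1..z3 = 12..14, z = 15
        BitSet parentBits = new BitSet();
        parentBits.set(5);
        parentBits.set(8);
        parentBits.set(11);
        parentBits.set(15);

        int[] queryAHits = {4, 6, 7, 14}; // w5, x1, x2, z3
        int[] queryBHits = parentsOf(queryAHits, parentBits);
        System.out.println(Arrays.toString(queryBHits)); // prints [5, 8, 15], ie w, x, z
    }
}
```

Note that y never shows up, since none of its children matched queryA.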

> : Also, these queries can be nested (from 2nd join in the star join),
> : and so it could be ChildAndGrandChildrenQuery.
>
> a) wouldn't that require you to wrap multiple of them? (ie: new
> BlockJoinQuery(new BlockJoinQuery(childQ, ...), ...)

Yes it would, ie, just like doing multiple joins.
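
As a rough illustration of the wrapping, here is a plain-Java sketch (not
the Lucene API) that applies the same "join up" step twice: child hits to
parents, then parents to grandparents. The two bit sets, the doc ID layout
and the names NestedJoinSketch and joinUp are assumptions for the example
only.

```java
import java.util.Arrays;
import java.util.BitSet;

public class NestedJoinSketch {
    /** One join level: map each hit to the first marked doc at or after it. */
    public static int[] joinUp(int[] hits, BitSet upperLevelBits) {
        return Arrays.stream(hits)
                .map(upperLevelBits::nextSetBit)
                .filter(doc -> doc >= 0)
                .distinct()
                .toArray();
    }

    public static void main(String[] args) {
        // Assumed layout: c1 = 0, c2 = 1, p1 = 2, c3 = 3, p2 = 4, g1 = 5,
        //                 c4 = 6, p3 = 7, g2 = 8
        BitSet parentBits = new BitSet();
        parentBits.set(2);
        parentBits.set(4);
        parentBits.set(7);
        BitSet grandparentBits = new BitSet();
        grandparentBits.set(5);
        grandparentBits.set(8);

        int[] childHits = {1, 6}; // c2 and c4
        // Inner join, then outer join: the analogue of wrapping one join in another.
        int[] parentHits = joinUp(childHits, parentBits);
        int[] grandparentHits = joinUp(parentHits, grandparentBits);

        System.out.println(Arrays.toString(parentHits));      // prints [2, 7]
        System.out.println(Arrays.toString(grandparentHits)); // prints [5, 8]
    }
}
```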

> b) the use of child suggests that if the base query matches parents, then
> the wrapper will match children ... i think you really want it to clarify
> that the relationship goes the other way -- it returns parents (and grand
> parents, etc..) of the wrapped query ?

I think we want to emphasize both, that it expects child matches from
the wrapped query and then returns corresponding ("joined") parent
matches.

Also... I think we are over-thinking the name ;)  We can't convey
*everything* in this name; as long as the name makes it clear that
you'll want to consider this / read its javadocs whenever doing
something with "nested docs", I think that's sufficient.  I think
NestedQueryWrapper (maybe NestedDocsQuery) and NestedDocsCollector are
good enough, at least better than the functionally driven names they now
have...

> : I guess Wrapper would make sense since it wraps a query matching the
> : nested docs.  I think Document is redundant/implied?
> :
> : Maybe NestedQueryWrapper?
>
> ugh... that seems like it might be really confusing ... "nested what?" ...
> "nested query?" .. "of course it's a wrapper, it wraps a nested query"

Yeah I think maybe drop the Wrapper and add Docs: NestedDocsQuery?

> another reason why naming the query after the 'parents' might better
> explain what it does.

I think parents is under-stating it, since it's really a bidirectional
thing.  Ie the BlockJoinQuery interacts with both parent and child
hits.

> if i'm misunderstanding the docs, and it can go all the way up the
> taxonomy w/o having to nest instances inside other instances then
> maybe "AncestorDocumentQueryWrapper" or "OuterDocumentQueryWrapper" could
> make sense?
>
> Dare i suggest "WrappingDocumentQueryWrapper" ?

Too long :)

> (i hate naming shit ... good names are too fucking hard .. and i can't
> find any antonyms for "nested" in the context we mean)

Naming is the hardest part :)

> : In fact, once we generalize TopDocs so that the type of each hit can
> : be parameterized then this collector would return TopDocs<NestedDoc>
>        ...
> : So I guess I would keep Top but drop Groups, and replace
> : ParentChildren with NestedDocs and move the Top in front:
> : TopNestedDocsCollector.
>
> yep, yep ... sounds just ... just feel like we need
> something better than anything we've come up with so far for the query,
> something to adequately explain that it (essentially) does the inverse of
> the collector -- going up the taxonomy and matching the parents/wrappers
> of the nested documents matched by the base query.

Honestly at this point I'm tempted to just stick with what we have
(the functionally driven names, instead of the dominant use case
driven name).

At its heart, this query is performing a join (well, finishing the
join that was done during indexing), and despite our efforts to more
descriptively capture the dominant use case, I don't think we're
succeeding.  We are basically struggling to squeeze an explanation of
what a join does into these class names.

Mike McCandless

http://blog.mikemccandless.com


