lucene-solr-user mailing list archives

From Jim.Musil <Jim.Mu...@target.com>
Subject Re: An interesting approach to grouping
Date Tue, 27 Jan 2015 19:50:15 GMT
Yes, I’m trying to pin down exactly what conditions cause the bug to
appear. It seems as though it’s only when using the query function.

Jim

On 1/27/15, 12:44 PM, "Ryan Josal" <rjosal@gmail.com> wrote:

>This is great, thanks Jim.  Your patch worked and the sorting solution
>meets the goal, although group.limit seems like it could cut various
>results out of the middle of the result set.  I will play around with it
>and see if it proves helpful.  Can you let me know the Jira so I can keep
>an eye on it?
>
>Ryan
>
>On Tuesday, January 27, 2015, Jim.Musil <Jim.Musil@target.com> wrote:
>
>> Interestingly, you can do something like this:
>>
>> group=true&
>> group.main=true&
>> group.func=rint(scale(query({!type=edismax v=$q}),0,20))& // puts into buckets
>> group.limit=20& // gives you 20 from each bucket
>> group.sort=category asc  // this will sort by category within each bucket,
>> but this can be a function as well.
>>
>>
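As a rough illustration of what rint(scale(...,0,20)) does to scores, here is a minimal standalone Java sketch. It is not Solr code: the class name ScoreBuckets, the method buckets, and the sample scores are made up for the example, and it assumes scale() is a plain linear min/max mapping over the scores it is given.

```java
import java.util.Arrays;

final class ScoreBuckets {
    /**
     * Linearly maps each score from [min(scores), max(scores)] onto [lo, hi],
     * then rounds to the nearest integer with Math.rint (ties go to even).
     */
    static long[] buckets(double[] scores, double lo, double hi) {
        double min = Arrays.stream(scores).min().orElse(0.0);
        double max = Arrays.stream(scores).max().orElse(0.0);
        double range = max - min;
        long[] out = new long[scores.length];
        for (int i = 0; i < scores.length; i++) {
            // If every score is identical there is nothing to scale; use the lower bound.
            double scaled = range == 0.0 ? lo : (scores[i] - min) / range * (hi - lo) + lo;
            out[i] = (long) Math.rint(scaled);
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical relevance scores; 4.8 and 4.85 end up in the same bucket.
        double[] scores = {1.0, 1.2, 3.0, 4.8, 4.85, 5.0};
        System.out.println(Arrays.toString(buckets(scores, 0, 20)));
    }
}
```

Documents whose scores round to the same integer share a bucket. Rounding values in the range 0 to 20 can produce up to 21 distinct bucket ids (0 through 20 inclusive).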
>>
>> Jim Musil
>>
>>
>>
>> On 1/27/15, 10:14 AM, "Jim.Musil" <Jim.Musil@target.com> wrote:
>>
>> >When using group.main=true, the results are not mixed as you expect:
>> >
>> >"If true, the result of the last field grouping command is used as the
>> >main result list in the response, using group.format=simple"
>> >
>> >https://wiki.apache.org/solr/FieldCollapsing
>> >
>> >
>> >Jim
>> >
>> >On 1/27/15, 9:22 AM, "Ryan Josal" <rjosal@gmail.com> wrote:
>> >
>> >>Thanks a lot!  I'll try this out later this morning.  If group.func and
>> >>group.field don't combine the way I think they might, I'll try to look for
>> >>a way to put it all in group.func.
>> >>
>> >>On Tuesday, January 27, 2015, Jim.Musil <Jim.Musil@target.com> wrote:
>> >>
>> >>> I'm not sure the query you provided will do what you want, BUT I did find
>> >>> the bug in the code that is causing the NullPointerException.
>> >>>
>> >>> The variable context is supposed to be shared at the class level, but
>> >>> prepare() declares it again as a local, so it is only defined in the scope
>> >>> of that function and the outer variable is never set.
>> >>>
>> >>> Here's the simple patch:
>> >>>
>> >>> Index: core/src/java/org/apache/solr/search/Grouping.java
>> >>> ===================================================================
>> >>> --- core/src/java/org/apache/solr/search/Grouping.java  (revision 1653358)
>> >>> +++ core/src/java/org/apache/solr/search/Grouping.java  (working copy)
>> >>> @@ -926,7 +926,7 @@
>> >>>       */
>> >>>      @Override
>> >>>      protected void prepare() throws IOException {
>> >>> -      Map context = ValueSource.newContext(searcher);
>> >>> +      context = ValueSource.newContext(searcher);
>> >>>        groupBy.createWeight(context, searcher);
>> >>>        actualGroupsToFind = getMax(offset, numGroups, maxDoc);
>> >>>      }
>> >>>
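The kind of bug the patch addresses can be reproduced outside Solr. The following is a minimal sketch, not the actual Grouping.java code: the classes ShadowingDemo, Buggy and Fixed, the method contextSize, and the map contents are all invented for illustration. It only shows how a local declaration that reuses a field's name leaves the field null.

```java
import java.util.HashMap;
import java.util.Map;

final class ShadowingDemo {
    /** Buggy variant: prepare() declares a new local that hides the field. */
    static class Buggy {
        Map<String, Object> context; // never assigned, so it stays null

        void prepare() {
            // The type on the left makes this a new local variable, not the field.
            Map<String, Object> context = new HashMap<>();
            context.put("searcher", "stub");
        }

        int contextSize() {
            return context.size(); // dereferences the null field
        }
    }

    /** Fixed variant: dropping the type turns the statement into a field assignment. */
    static class Fixed {
        Map<String, Object> context;

        void prepare() {
            context = new HashMap<>();
            context.put("searcher", "stub");
        }

        int contextSize() {
            return context.size();
        }
    }

    /** Helper for the demo: reports whether the action throws a NullPointerException. */
    static boolean throwsNpe(Runnable action) {
        try {
            action.run();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Buggy buggy = new Buggy();
        buggy.prepare();
        System.out.println("buggy throws NPE: " + throwsNpe(() -> buggy.contextSize()));

        Fixed fixed = new Fixed();
        fixed.prepare();
        System.out.println("fixed context size: " + fixed.contextSize());
    }
}
```

The one-word difference between the two prepare() methods mirrors the one-line change in the patch above.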
>> >>>
>> >>> I'll search for a Jira issue and open one if I can't find it.
>> >>>
>> >>> Jim Musil
>> >>>
>> >>>
>> >>>
>> >>> On 1/26/15, 6:34 PM, "Ryan Josal" <ryan@josal.com> wrote:
>> >>>
>> >>> >I have an index of products, and these products have a "category" which we
>> >>> >can say for now is a good approximation of its location in the store.  I'm
>> >>> >investigating altering the ordering of the results so that the categories
>> >>> >aren't interlaced as much... so that the results are a little bit more
>> >>> >grouped by category, but not *totally* grouped by category.  It's
>> >>> >interesting because it's an approach that sort of compares results to
>> >>> >near-scored/ranked results.  One of the hoped-for outcomes of this would be
>> >>> >that there would be somewhat fewer categories represented in the top results
>> >>> >for a given query, although it is questionable whether this is a good
>> >>> >measurement to determine the effectiveness of the implementation.
>> >>> >
>> >>> >My first attempt was:
>> >>> >
>> >>> >group=true&group.main=true&group.field=category&group.func=rint(scale(query({!type=edismax v=$q}),0,20))
>> >>> >
>> >>> >Or some FunctionQuery like that, so that in order to become a member of a
>> >>> >group, the doc would have to have the same category, and be dropped into
>> >>> >the same score bucket (20 in this case).  This doesn't work out of the
>> >>> >gate due to an NPE (Solr 4.10.2) (although I'm not sure it would work
>> >>> >anyway):
>> >>> >
>> >>> >java.lang.NullPointerException
>> >>> >    at org.apache.lucene.queries.function.valuesource.ScaleFloatFunction.getValues(ScaleFloatFunction.java:104)
>> >>> >    at org.apache.solr.search.DoubleParser$Function.getValues(ValueSourceParser.java:1111)
>> >>> >    at org.apache.lucene.search.grouping.function.FunctionFirstPassGroupingCollector.setNextReader(FunctionFirstPassGroupingCollector.java:82)
>> >>> >    at org.apache.lucene.search.MultiCollector.setNextReader(MultiCollector.java:113)
>> >>> >    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:612)
>> >>> >    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
>> >>> >    at org.apache.solr.search.Grouping.searchWithTimeLimiter(Grouping.java:451)
>> >>> >    at org.apache.solr.search.Grouping.execute(Grouping.java:368)
>> >>> >    at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:459)
>> >>> >    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218)
>> >>> >    at
>> >>> >
>> >>> >
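The grouping rule described before the stack trace (same category and same score bucket) can be pictured as grouping on a composite key. The sketch below is only an illustration in plain Java and says nothing about how Solr combines group.field with group.func; the names CompositeGrouping, Doc and group, and the sample documents, are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CompositeGrouping {
    /** A ranked hit with its category and a precomputed score bucket. */
    static final class Doc {
        final String id;
        final String category;
        final long bucket;

        Doc(String id, String category, long bucket) {
            this.id = id;
            this.category = category;
            this.bucket = bucket;
        }
    }

    /**
     * Groups docs that share both category and bucket. Groups appear in the
     * order their first member appears in the ranked input list.
     */
    static Map<String, List<String>> group(List<Doc> rankedDocs) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (Doc d : rankedDocs) {
            String key = d.category + "|" + d.bucket; // composite group key
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(d.id);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Doc> ranked = Arrays.asList(
                new Doc("a", "dairy", 20),
                new Doc("b", "bakery", 20),
                new Doc("c", "dairy", 20),
                new Doc("d", "dairy", 12));
        System.out.println(group(ranked));
    }
}
```

With this rule, docs "a" and "c" are pulled together because they share category and bucket, while "d" stays apart despite having the same category, which is the "not totally grouped by category" effect the message asks for.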
>> >>> >Has anyone tried something like this before, and does anyone have any novel
>> >>> >ideas for how to approach it, no matter how different?  How about a
>> >>> >workaround for the group.func error here?  I'm very open-minded about where
>> >>> >to go on this one.
>> >>> >
>> >>> >Thanks,
>> >>> >Ryan
>> >>>
>> >>>
>> >
>>
>>
