lucene-dev mailing list archives

From "Young, Cody" <Cody.Yo...@move.com>
Subject RE: Grouping on Long type uses function query?
Date Tue, 29 Nov 2011 23:30:23 GMT
Hi Martijn,

Thanks for the response!

Doesn't it take a lot more memory to hold a string field in the FieldCache than a long field?


In our grouping scenario, we have many unique values with a small number of documents per
group. I would think that even the double FieldCache memory hit on a long would be less than
using a string. 
 
Would this be a suitable place for a grouping parameter that controls the behavior, e.g. group.method?
I'm also looking at using the BlockGroupingCollector, so perhaps "block" could be another
choice.
The downside is that there are invalid combinations. (You wouldn't change group.method
to anything else if you were grouping by a function.)
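
To make the idea concrete, here is a rough sketch of how such a parameter might be resolved. Everything in it is hypothetical: group.method is only the proposal above, not an existing Solr parameter, and GroupMethodSketch, GroupMethod and resolve are illustrative names. The defaults are assumed to mirror the current behavior (Term for string fields, Function otherwise).

```java
import java.util.Locale;

/** Hypothetical sketch of resolving a proposed group.method parameter; not Solr code. */
class GroupMethodSketch {

    /** Illustrative choices: Term-based, Function-based, or block grouping. */
    enum GroupMethod { TERM, FUNCTION, BLOCK }

    /**
     * @param requested       raw value of the hypothetical group.method parameter, may be null
     * @param groupByFunction true when the request groups by a function rather than a plain field
     * @param stringField     true when the grouped field is a string field
     */
    static GroupMethod resolve(String requested, boolean groupByFunction, boolean stringField) {
        // Assumed default: Term for plain string fields, Function for everything else.
        GroupMethod fallback =
                (!groupByFunction && stringField) ? GroupMethod.TERM : GroupMethod.FUNCTION;
        if (requested == null || requested.isEmpty()) {
            return fallback;
        }
        // valueOf throws IllegalArgumentException for an unknown name.
        GroupMethod method = GroupMethod.valueOf(requested.toUpperCase(Locale.ROOT));
        // The invalid combination: a function can only be grouped with the Function method.
        if (groupByFunction && method != GroupMethod.FUNCTION) {
            throw new IllegalArgumentException(
                    "group.method=" + requested + " cannot be combined with grouping by a function");
        }
        return method;
    }
}
```

With this, a long field would still default to the Function method, while group.method=term would opt in to the faster Term-based path at the cost of the extra FieldCache entry.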

Thanks,
Cody

-----Original Message-----
From: martijn.is.hier@gmail.com [mailto:martijn.is.hier@gmail.com] On Behalf Of Martijn v
Groningen
Sent: Tuesday, November 29, 2011 2:09 PM
To: dev@lucene.apache.org
Subject: Re: Grouping on Long type uses function query?

If I remember correctly this was done to avoid insane FieldCache usage.

If the Term-based grouping implementation is used, an entry of type DocTermsIndex is created
in the FieldCache for that field. Other search features such as sorting and faceting may then
create a second entry for the same field. Sorting, for example, will in your case add an entry
of type long for this field. With the Function-based grouping implementations this does not
happen: only one cache entry of type long is put in the FieldCache, and sorting or faceting
will reuse it.
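
The difference can be illustrated with a toy model. This is only a sketch: ToyFieldCache is a made-up class that counts entries per (field, type) pair, it is not Lucene's FieldCache, and the field name used with it is arbitrary.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Toy model of a cache keyed by (field, entry type); not Lucene's FieldCache. */
class ToyFieldCache {
    private final Set<String> entries = new LinkedHashSet<>();

    /** Registers an entry for the field and type; an existing entry is reused. */
    void load(String field, String entryType) {
        entries.add(field + ":" + entryType);
    }

    /** Number of distinct entries currently held. */
    int size() {
        return entries.size();
    }

    /** Term-based grouping followed by sorting on the same long field. */
    static ToyFieldCache termGroupingThenSort(String field) {
        ToyFieldCache cache = new ToyFieldCache();
        cache.load(field, "DocTermsIndex"); // created by Term-based grouping
        cache.load(field, "long");          // created by sorting on the long value
        return cache;
    }

    /** Function-based grouping followed by sorting on the same long field. */
    static ToyFieldCache functionGroupingThenSort(String field) {
        ToyFieldCache cache = new ToyFieldCache();
        cache.load(field, "long"); // created by Function-based grouping
        cache.load(field, "long"); // sorting reuses the existing entry
        return cache;
    }
}
```

In this model the Term-based path ends with two entries for the field and the Function-based path with one.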

The downside of the Function-based grouping implementations is that they are slower than the
Term-based implementation.
At the time this feature was integrated into Solr, the decision was made to avoid double
FieldCache usage per field and to use the slower Function-based implementation for non-string
fields.

The workaround that doesn't involve coding is to make a copy field of type string, but then
you add more fields / data to your index...

On 29 November 2011 22:25, Young, Cody <Cody.Young@move.com> wrote:
> Hi All,
>
>
>
> I’m new to solr development. Since I’m new with the code base, I 
> thought I’d double check here before making a JIRA issue. We’re trying 
> to use grouping on a field with a type of long (on trunk):
>
>     <fieldType name="long" class="solr.TrieLongField" precisionStep="0"
> omitNorms="true" positionIncrementGap="0"/>
>
>
>
> The performance wasn’t what we were looking for so I’m taking a quick 
> look at the grouping code in solr and I noticed that a string field 
> uses the Term grouping classes (CommandField in 
> /trunk/solr/core/src/java/org/apache/solr/search/Grouping.java). 
> However, when using a long field the Function grouping classes get 
> used (CommandFunc in 
> /trunk/solr/core/src/java/org/apache/solr/search/Grouping.java). When 
> I change it over to using CommandField instead of CommandFunc for long 
> type I get a decrease in QTime (I only did light testing, and just simple
> queries, but it seemed to drop by 50% or so).
>
>
>
> The functionality appears to still work and the grouping tests pass, 
> but as I’m not very familiar with the solr code I wasn’t sure if there 
> was a reason for Long to use CommandFunc instead of CommandField.
>
>
>
> I’m happy to take a stab at making a JIRA issue and a patch if this is 
> indeed an issue, but I’ll need some guidance on the best way to fix 
> this (perhaps instead of using instanceof StrFieldSource or instanceof 
> LongFieldSource there is a better way to check?).
>
>
>
> The change I made to test this was very simple, I just added:
>
>
>
> import org.apache.lucene.queries.function.valuesource.LongFieldSource;
>
>
>
> and at Line 176 of Grouping.java
>
>      } else if (valueSource instanceof LongFieldSource) {
>          String field = ((LongFieldSource) valueSource).getField();
>          CommandField commandField = new CommandField();
>          commandField.groupBy = field;
>          gc = commandField;
>
>
>
> Thanks,
>
> Cody



--
Met vriendelijke groet,

Martijn van Groningen

