lucene-dev mailing list archives

From Erik Hatcher <erik.hatc...@gmail.com>
Subject Re: new facet parameter: facet.exists=true
Date Tue, 30 Mar 2010 15:06:29 GMT
Faceting on a "facet_fields" field will most likely yield only a
handful of distinct values, so that particular facet can be cached
and served quickly.  I'm not sure how much memory it would take up,
but certainly not as much as actually faceting on the fields
themselves.
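
A minimal sketch of how such a field could be filled at index time,
assuming documents are plain dicts before they are sent to Solr.  The
helper name add_facet_field_names and the sample field names are
hypothetical; only the "facet_fields" field name comes from this
thread.

```python
def add_facet_field_names(doc, facet_fields, marker_field="facet_fields"):
    """Return a copy of doc with marker_field listing the facet fields that have a value.

    Hypothetical helper: doc is a plain dict, facet_fields is the list of
    field names you intend to facet on.
    """
    # A field counts as "present" unless it is missing, None, "" or an empty list.
    present = [name for name in facet_fields
               if doc.get(name) not in (None, "", [])]
    enriched = dict(doc)  # shallow copy, the input document is left untouched
    enriched[marker_field] = present
    return enriched


if __name__ == "__main__":
    doc = {"id": "1", "category": "books", "author": ""}
    print(add_facet_field_names(doc, ["category", "author", "price"]))
```

At query time, facet.field=facet_fields would then report which
facet fields occur in the result set.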

However, another approach you could take is to use facet.query.
facet.query=some_facet_field:[* TO *] will return a non-zero count
if any documents in the results have a value in some_facet_field.
You'd of course need to add a separate facet.query parameter for
each field you care about.
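
As a rough sketch of that approach, the following builds one
facet.query parameter per field and turns the returned counts into
yes/no flags.  The function names and the sample field names are
hypothetical, and the dict passed to fields_present is assumed to be
shaped like the facet_queries section of a JSON response (query
string mapped to count).

```python
from urllib.parse import urlencode

RANGE_SUFFIX = ":[* TO *]"  # open-ended range: matches any document with a value


def build_exists_params(query, fields):
    """Build request parameters with one facet.query per field of interest."""
    params = [("q", query), ("rows", "0"), ("facet", "true")]
    for field in fields:
        params.append(("facet.query", field + RANGE_SUFFIX))
    return params


def fields_present(facet_queries):
    """Map each field name to True if its facet.query count is non-zero."""
    result = {}
    for query_string, count in facet_queries.items():
        if query_string.endswith(RANGE_SUFFIX):
            field = query_string[:-len(RANGE_SUFFIX)]
            result[field] = count > 0
    return result


if __name__ == "__main__":
    params = build_exists_params("*:*", ["category", "author"])
    # urlencode accepts a list of pairs, so repeated keys are preserved.
    print(urlencode(params))
    print(fields_present({"category:[* TO *]": 12, "author:[* TO *]": 0}))
```

Note that each facet.query still returns the full count of matching
documents; the sketch only reduces that count to a flag on the
client side.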

	Erik

On Mar 30, 2010, at 10:45 AM, Gregor Kaczor wrote:

> I am not sure I understood your approach correctly. If I did not,
> please explain where the advantages are in time and memory footprint.
>
> In my opinion, faceting on facet field names does not avoid counting
> facets. If my result set is huge, so will be the facet counts on
> the field of facet names. It does not seem to me to save memory
> or time.
>
> My idea is to stop counting facets after finding one. It would show
> that for a certain query there are some categories available. My aim
> is to keep the memory footprint low while still being able to facet
> over more than 10^7 documents, a problem I am dealing with right now.
>
> -------- Original Message --------
>> Date: Tue, 30 Mar 2010 08:46:23 -0400
>> From: Erik Hatcher <erik.hatcher@gmail.com>
>> To: java-dev@lucene.apache.org
>> Subject: Re: new facet parameter: facet.exists=true
>
>> One trick to doing this is to index a field that lists the facet
>> field names that each document possesses.  Then you can facet on the
>> field of field names (sounds confusing, sorry) and you'll know if
>> there are any documents in a result set that have values in, say, a
>> "category" field.
>>
>> There's actually a basic patch out there that'll do this
>> automatically: https://issues.apache.org/jira/browse/SOLR-1280  - it
>> needs a bit of polish, but that's the general idea.
>>
>> 	Erik
>>
>> On Mar 30, 2010, at 7:46 AM, Gregor Kaczor wrote:
>>
>>> Faceting on indexes with more than twenty million documents
>>> consumes a lot of time and, in particular, memory.
>>>
>>> In such huge indexes I am not interested in whether there are 4 or 5
>>> million documents of a particular type; I just want to know that
>>> there are some, and that choosing that facet will give me a list of
>>> results.
>>>
>>> Such an option would just count the first occurrence of a facet term
>>> and return it without doing much computation.
>>>
>>> I could not figure out how to get that behaviour with the existing
>>> faceting parameters.
>>>
>>> What do you think?
>>>
>>> Gregor Kaczor
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>
>>
>>
>
>



