lucene-solr-user mailing list archives

From "Ryan McKinley" <ryan...@gmail.com>
Subject Re: 'accumulate' copyField for faceting
Date Thu, 01 Mar 2007 22:27:42 GMT
On 3/1/07, Yonik Seeley <yonik@apache.org> wrote:
> On 3/1/07, Ryan McKinley <ryantxu@gmail.com> wrote:
> > Faceting is much happier if you use a single valued field, but my apps
> > all require multivalued fields:
>
> If by "happy" you mean performance, things should be better in the
> future though.
>

Yes, performance.  The docs seem to say "avoid faceting on
multiValued fields if possible".

With SOLR-153, do you think that won't be an issue anymore?


> >
> > I'd like to use copyField to accumulate the multivalued fields into a
> > single field that can be efficiently faceted.
>
> Not sure I understand...  you don't want counts for aaa, bbb, and ccc
> separately, but you want counts for the combined values "aaa;bbb;ccc"?
>
> I'm not sure I see the usecases for this.
>

Maybe it's clearer if I say:

<arr name="subject">
  <str>San Francisco</str>
  <str>San Diego</str>
  <str>DC</str>
</arr>

I want facets for "San Francisco", "San Diego" and "DC", not "san"
"francisco", "diego", "dc".  I want the faceting to be as efficient as
it could/should be.  If I search for "San Fran" (or San Leandro) this
doc should show up.

I was suggesting using copyField to accumulate the cities into a
single field used for faceting:
  tokens[] = "San Francisco; San Diego; DC".split( ";" )
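
As a rough sketch of that split (plain Python, not Solr code; the helper
name split_accumulated and the ";" delimiter are assumptions made up for
illustration), trimming whitespace so each city comes out as one whole
facet token:

```python
def split_accumulated(value, delimiter=";"):
    """Split an accumulated field value into whole facet tokens.

    Hypothetical helper, not part of Solr: it only illustrates keeping
    each multi-word value intact instead of tokenizing on whitespace.
    """
    # Strip surrounding whitespace and drop empty pieces (e.g. a trailing ';').
    return [part.strip() for part in value.split(delimiter) if part.strip()]


tokens = split_accumulated("San Francisco; San Diego; DC")
print(tokens)
```

Without the strip(), a bare split on ";" would leave a leading space on
" San Diego" and " DC", which would then show up as distinct facet values.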

In my current setup, I have:

<field name="subject" type="string" indexed="true" stored="true"
multiValued="true"/>
<field name="subject_txt" type="text" indexed="true" stored="false"
multiValued="true"/>
<copyField source="subject" dest="subject_txt"  />

I facet on the multivalued field "subject" and search on the text
field "subject_txt" -- "subject" is stored as a "string" so that the
tokens resemble the input, and "subject_txt" is tokenized for search.
If I have to go through the overhead of copyField to make search and
faceting work nicely together, it may as well be configured to be as
efficient as possible.  Should I ignore the problem for now, and bank
on SOLR-153?
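
To make the two-field idea concrete, here is a toy model in Python. It is
not how Solr works internally: the function names, the second document,
and the lowercase/whitespace "analyzer" are all assumptions for
illustration. Facets are counted over whole values (like the "string"
field), while search matches against tokenized terms (like the "text"
field):

```python
from collections import Counter

# Toy documents; doc 1 mirrors the example above, doc 2 is made up.
docs = [
    {"id": 1, "subject": ["San Francisco", "San Diego", "DC"]},
    {"id": 2, "subject": ["San Francisco"]},
]


def facet_counts(docs, field):
    """Count documents per whole value, as faceting on a 'string' field would."""
    counts = Counter()
    for doc in docs:
        counts.update(set(doc[field]))  # count each value once per document
    return counts


def tokenize(value):
    """Crude stand-in for a 'text' analyzer: lowercase, split on whitespace."""
    return value.lower().split()


def search(docs, field, query):
    """Return ids of docs whose tokenized field contains every query term."""
    terms = tokenize(query)
    hits = []
    for doc in docs:
        tokens = {t for value in doc[field] for t in tokenize(value)}
        if all(term in tokens for term in terms):
            hits.append(doc["id"])
    return hits


print(facet_counts(docs, "subject"))
print(search(docs, "subject", "san diego"))
```

The facet keys stay as "San Francisco", "San Diego" and "DC", while the
search side only ever sees the lowercased tokens.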

Am I missing something?

thanks
ryan
