lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: how does Solr/Lucene index multi-value fields
Date Tue, 31 May 2011 16:27:35 GMT
Hmmm, I may have misled you. Re-reading my text, it
wasn't very well written...

TF/IDF calculations are, indeed, per-field. I was trying
to say that, as far as relevance calculations go, there is
no difference between storing all the data for an individual
field as a single long string of text in a single-valued field
and storing it as several shorter strings in a multi-valued field.
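
To make that concrete, here is a rough toy sketch in Python. It is not
Lucene's scoring code; field_stats and the sample strings are made up for
illustration, and it ignores analysis chains, position increment gaps and
norm encoding. It only shows that pooling the tokens of several values gives
the same term counts and field length as one concatenated value:

```python
from collections import Counter


def field_stats(values):
    """Toy model: pool the tokens of every value stored in one field.

    Returns (term frequencies, total token count) for the field as a whole.
    """
    tokens = [tok for value in values for tok in value.lower().split()]
    return Counter(tokens), len(tokens)


# The same text, stored two ways (hypothetical sample data).
multi_valued = ["solr indexes text", "lucene indexes text too"]
single_valued = [" ".join(multi_valued)]

multi_tf, multi_len = field_stats(multi_valued)
single_tf, single_len = field_stats(single_valued)

# Both layouts yield identical per-field statistics in this toy model.
print(multi_tf == single_tf, multi_len == single_len)
```

Where the two layouts can differ in practice is position-sensitive matching
(phrase queries across value boundaries), not the per-field term statistics.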

Best
Erick

On Tue, May 31, 2011 at 12:16 PM, Ian Holsman <hadoop@holsman.net> wrote:
>
> On May 31, 2011, at 12:11 PM, Erick Erickson wrote:
>
>> Can you explain the use-case a bit more here? Especially the post-query
>> processing and how you expect the multiple documents to help here.
>>
>
> We have a collection of related stories. When a user searches for something, we might
> not want to display the story that is most relevant (according to Solr), but the one chosen by
> other home-grown rules. By combining all the possibilities in one SolrDocument, we can avoid
> a DB hit to get related stories.
>
>
>> But TF/IDF is calculated over all the values in the field. There's really no
>> difference between a multi-valued field and storing all the data in a
>> single field
>> as far as relevance calculations are concerned.
>>
>
> So... it will suck regardless. I thought we had per-field relevance in the current trunk. :-(
>
>
>> Best
>> Erick
>>
>> On Tue, May 31, 2011 at 11:02 AM, Ian Holsman <hadoop@holsman.net> wrote:
>>> Hi.
>>>
>>> I want to store a list of documents (say, each being 30-60k of text) in a single
>>> SolrDocument (to speed up post-retrieval querying).
>>>
>>> In order to do this, I need to know whether Lucene calculates the TF/IDF score over
>>> the entire field, or whether it treats each value in the list as a unique field.
>>>
>>> If I can't store it as a multi-valued field, I could create a schema where I put each
>>> document into a unique field, but I'm not sure how to create the query to search all the fields.
>>>
>>>
>>> Regards
>>> Ian
>>>
>>>
>
>
