lucene-solr-user mailing list archives

From Paul Rosen <p...@performantsoftware.com>
Subject Re: Updating a solr record
Date Thu, 27 Aug 2009 20:17:18 GMT
Hi Eric,

I think I understand what you are saying but I'm not sure how it would work.

I think you are saying to have two different indexes, each holding the 
same documents, but one with the hard-to-get fields and the other with the 
easy-to-get fields. Then I would make the same query twice, once against 
each index.

So, let's say I'm looking for all documents that contain the word "poem" 
and I want to initially display the 10 most relevant matches. I 
think I'd have to ask each index for its 10 most relevant matches, then 
merge them myself, and display the appropriate ones.

Well, the same document could appear in both lists so I'd have to get 
rid of duplicates. Also, wouldn't the relevancy of the duplicate doc go 
up? But I wouldn't know by how much.
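
A minimal sketch of the merge-and-deduplicate step described above, assuming each index returns its hits as (document id, score) pairs. The function name and the sample data are hypothetical. Keeping the higher of the two scores for a duplicate is only one possible policy, and scores computed by two separate indexes are not necessarily comparable, so the resulting order is approximate at best.

```python
def merge_hits(hits_a, hits_b, limit=10):
    """Merge two ranked hit lists, dropping duplicates by document id."""
    best = {}
    for doc_id, score in list(hits_a) + list(hits_b):
        # Keep the higher score when a document appears in both lists
        # (an assumed policy; the thread does not settle on one).
        if doc_id not in best or score > best[doc_id]:
            best[doc_id] = score
    # Sort by descending score, breaking ties by id for a stable order.
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


# Hypothetical hits from the "easy" and "hard" indexes.
easy = [("doc1", 0.9), ("doc2", 0.5)]
hard = [("doc2", 0.7), ("doc3", 0.6)]
merged = merge_hits(easy, hard)
```

Here "doc2" appears in both lists and survives once, with the higher of its two scores.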

That's the first problem, but then what if the user wants to see page 2? 
I certainly wouldn't query for documents #10-19 on each server.
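
One way to approach the paging problem, sketched under assumptions: rather than asking each index for documents #10-19, ask each for its top 20, merge, and then slice out positions 10-19 of the merged list. The fetch callables below are hypothetical stand-ins for whatever query is sent to each index, and the integer scores are made up for illustration.

```python
def merged_page(fetch_a, fetch_b, page, size=10):
    """Return one page (0-based) of the merged results of two indexes.

    fetch_a and fetch_b are hypothetical callables that take n and
    return that index's top n hits as (doc_id, score) pairs.
    """
    needed = (page + 1) * size  # every hit that could land on this page
    best = {}
    for fetch in (fetch_a, fetch_b):
        for doc_id, score in fetch(needed):
            # A duplicate keeps its higher score (assumed policy).
            best[doc_id] = max(score, best.get(doc_id, score))
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return ranked[page * size:needed]


# Stand-in indexes with interleaved scores: a0=100, b0=99, a1=98, ...
index_a = [("a%d" % i, 100 - 2 * i) for i in range(30)]
index_b = [("b%d" % i, 99 - 2 * i) for i in range(30)]
page2 = merged_page(lambda n: index_a[:n], lambda n: index_b[:n], page=1)
```

The cost grows with the page number, since each index has to return every hit up to the end of the requested page.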

Eric Pugh wrote:
> Right...  You know, if some of your data needs to be updated frequently,
> but other data is updated once per year, and is a really massive dataset,
> then maybe split it up into separate cores?  Since you mentioned
> that you can't get the raw data again, you could just duplicate your
> existing index by doing a filesystem copy.  Leave that alone so you
> don't update it and lose your data, and start a new core that you can
> update and ignore the fact that it has all the website data in it.  And tie
> the two cores' data sets together outside of Solr.
> 
> Eric
> 
> 
> 
> On Thu, Aug 27, 2009 at 1:46 PM, Paul Tomblin<ptomblin@xcski.com> wrote:
>> On Thu, Aug 27, 2009 at 1:27 PM, Eric
>> Pugh<epugh@opensourceconnections.com> wrote:
>>> You can just query Solr, find the records that you want (including all
>>> the website data).  Update them, and then send the entire record back.
>>>
>> Correct me if I'm wrong, but I think you'd end up losing the fields
>> that are indexed but not stored.
>>
>>
>> --
>> http://www.linkedin.com/in/paultomblin
>>

