lucene-solr-user mailing list archives

From Uri Boness <ubon...@gmail.com>
Subject Re: Updating a solr record
Date Fri, 28 Aug 2009 00:35:42 GMT
I guess if you have stored="true" then there is no problem.
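
For what it's worth, the stored="true" round trip might look roughly like the sketch below. It only builds the paged id:[* TO *] query URL and cleans up the returned documents; fetching and re-posting are left out. The function names, the base URL, the drop_fields parameter and the sample response are all made up for illustration, and it assumes the JSON response format (wt=json).

```python
from urllib.parse import urlencode


def select_url(base_url, start, rows):
    """Build a paged query URL that matches every document with an id."""
    params = {"q": "id:[* TO *]", "start": start, "rows": rows, "wt": "json"}
    return base_url.rstrip("/") + "/select?" + urlencode(params)


def docs_for_reindex(response, drop_fields=("score",)):
    """Turn one parsed JSON query response into documents ready to re-add."""
    docs = response["response"]["docs"]
    # Drop fields that must not be sent back, e.g. stored copyField targets,
    # which would otherwise receive their values twice.
    return [
        {name: value for name, value in doc.items() if name not in drop_fields}
        for doc in docs
    ]


# Hand-written stand-in for a parsed response; not real data.
sample = {
    "response": {
        "numFound": 2,
        "start": 0,
        "docs": [
            {"id": "a1", "title": "First", "score": 1.0},
            {"id": "a2", "title": "Second", "score": 1.0},
        ],
    }
}
url = select_url("http://localhost:8983/solr", 0, 100)
docs = docs_for_reindex(sample)
print(url)
print(docs)
```

Each cleaned document can then be tweaked and added to the new index; paging with start and rows keeps any single response small.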

> 2. If you don't use stored="true" you can still get access to term vectors,
> which you can probably reuse to create a fake field with the same term vector in
> an updated document... just an idea, maybe I am wrong...
Reconstructing the field value from a term enum might work. Of 
course the value won't be the same as the original, but when it is 
re-indexed, if you don't have any really special filters (e.g. a 
shingle filter), the tokens will most likely be re-indexed as they are 
(that is, the filters will most likely have no effect). Just make sure 
to take the position increments into account! For example, if you have 
a synonym filter set up, you'll need to choose only one term per 
position (otherwise the term frequency of the document will 
increase on every update).
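
A minimal sketch of the "one term per position" idea, assuming the terms and their positions have already been read out of the index into a plain dict (that requires positions to be available, e.g. term vectors with positions). The function name and the sample data are hypothetical, and picking the alphabetically first term at a shared position is an arbitrary choice. Position gaps left by removed tokens (stop words, for instance) are not restored here.

```python
from collections import defaultdict


def rebuild_field_text(term_positions):
    """Rebuild an approximate field value from {term: [positions]}."""
    # Invert the mapping: position -> every term indexed at that position.
    by_position = defaultdict(list)
    for term, positions in term_positions.items():
        for pos in positions:
            by_position[pos].append(term)

    # Keep exactly one term per position, so terms stacked on the same
    # position (synonyms) are not sent back as extra tokens on every update.
    # min() is an arbitrary but deterministic way to pick one.
    tokens = [min(by_position[pos]) for pos in sorted(by_position)]
    return " ".join(tokens)


# Example: "quick" and "fast" are synonyms stacked at position 1.
term_positions = {
    "the": [0],
    "quick": [1],
    "fast": [1],
    "fox": [2],
}
rebuilt = rebuild_field_text(term_positions)
print(rebuilt)
```

The rebuilt string can then be used as the field value in the updated document; the analyzer will expand the synonyms again at index time.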

Uri

Fuad Efendi wrote:
> I haven't read all messages in this thread yet, but I probably have an
> answer to some questions...
>
> 1. You want to change schema.xml and to reindex, but you don't have access
> to source documents (stored somewhere on Internet). But you probably use
> stored="true" in your schema. Then, use SOLR as your storage device, use
> id:[* TO *] to retrieve documents from SOLR and reindex them under another
> SOLR schema...
>
> 2. If you don't use stored="true" you can still get access to term vectors,
> which you can probably reuse to create a fake field with the same term vector in
> an updated document... just an idea, maybe I am wrong...
>
>
> -----Original Message-----
> From: Paul Rosen [mailto:paul@performantsoftware.com] 
> Sent: August-27-09 1:22 PM
> To: solr-user@lucene.apache.org
> Subject: Updating a solr record
>
> I realize there is no way to update particular fields in a solr record. 
> I know the recommendation is to delete the record from the index and 
> re-add it, but in my case, it is difficult to completely reindex, so 
> that creates problems with my work flow.
>
> That is, the info that I use to create a solr doc comes from two places: 
> a local file that contains most of the info, and a URL in that file that 
> points to a web page that contains the rest of the info.
>
> To completely reindex, we have to hit every website again, which is 
> problematic for a number of reasons. (Plus, those websites don't change 
> much, so it is just wasted effort.) (Once in a while we do reindex, and 
> it is a huge production to do so.)
>
> But that means that if I want to make a small change to either 
> schema.xml or the local files that I'm indexing, I can't. I can't even 
> fix minor bugs until our yearly reindexing.
>
> So, the question is:
>
> Is there any way to get the info that is already in the solr index for a 
> document, so that I can use that as a starting place? I would just tweak 
> that record and add it again.
>
> Thanks,
> Paul
>
>
>
>   
