Message-ID: <4A97265E.4040804@gmail.com>
Date: Fri, 28 Aug 2009 02:35:42 +0200
From: Uri Boness
To: solr-user@lucene.apache.org
Subject: Re: Updating a solr record
In-Reply-To: <000e01ca2758$5f7c56f0$1e7504d0$@ca>

I guess if you have stored="true" then there is no problem.

> 2. If you don't use stored="true" you can still get access to term vectors,
> which you can probably reuse to create fake field with same term vector in
> an updated document... just an idea, may be I am wrong...

Reconstructing the field value from a term enum might work. Of course, the
value won't be identical to the original, but if you don't have any really
special filters (e.g. a shingle filter), the tokens will most likely be
re-indexed as they are (that is, the filters will most likely have no
further effect on them). Just make sure to take the position increments
into account! For example, if you have a synonym filter set up, you'll need
to choose only one term per position (otherwise the term frequency in the
document will increase on every update).
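
To make the one-term-per-position point concrete, here is a rough sketch in
Python rather than against the Lucene API. The input shape (a dict mapping
each term to the positions it occupies) and the function name are made up
for illustration, and it assumes positions were stored with the term vector:

```python
def rebuild_text(term_positions):
    """Rebuild an approximate field value from term -> positions data.

    term_positions is a hypothetical structure: {term: [position, ...]}.
    When several terms share a position (e.g. injected synonyms), only
    one of them is kept, so re-indexing does not inflate term frequencies.
    """
    by_position = {}
    # Iterate terms in sorted order so the choice among terms that share
    # a position is deterministic (the alphabetically first term wins).
    for term in sorted(term_positions):
        for pos in term_positions[term]:
            by_position.setdefault(pos, term)
    # Emit the surviving terms in position order.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# "quick" and "fast" share position 1, as a synonym filter would produce.
vector = {"the": [0], "quick": [1], "fast": [1], "fox": [2]}
print(rebuild_text(vector))
```

Note that this sketch simply joins the surviving terms, so any gaps in the
positions (for example from removed stop words) are not preserved; handling
those is part of taking the position increments into account.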
Uri

Fuad Efendi wrote:
> I haven't read all messages in this thread yet, but I probably have an
> answer to some questions...
>
> 1. You want to change schema.xml and to reindex, but you don't have access
> to source documents (stored somewhere on Internet). But you probably use
> stored="true" in your schema. Then, use SOLR as your storage device, use
> id:[* TO *] to retrieve documents from SOLR and reindex it in another SOLR
> schema...
>
> 2. If you don't use stored="true" you can still get access to term vectors,
> which you can probably reuse to create fake field with same term vector in
> an updated document... just an idea, may be I am wrong...
>
>
> -----Original Message-----
> From: Paul Rosen [mailto:paul@performantsoftware.com]
> Sent: August-27-09 1:22 PM
> To: solr-user@lucene.apache.org
> Subject: Updating a solr record
>
> I realize there is no way to update particular fields in a solr record.
> I know the recommendation is to delete the record from the index and
> re-add it, but in my case, it is difficult to completely reindex, so
> that creates problems with my work flow.
>
> That is, the info that I use to create a solr doc comes from two places:
> a local file that contains most of the info, and a URL in that file that
> points to a web page that contains the rest of the info.
>
> To completely reindex, we have to hit every website again, which is
> problematic for a number of reasons. (Plus, those websites don't change
> much, so it is just wasted effort.) (Once in a while we do reindex, and
> it is a huge production to do so.)
>
> But that means that if I want to make a small change to either
> schema.xml or the local files that I'm indexing, I can't. I can't even
> fix minor bugs until our yearly reindexing.
>
> So, the question is:
>
> Is there any way to get the info that is already in the solr index for a
> document, so that I can use that as a starting place? I would just tweak
> that record and add it again.
>
> Thanks,
> Paul
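
On point 1 in the quoted message, paging through the stored documents could
look roughly like the sketch below. Only the id:[* TO *] query comes from
the thread; the base URL, the page size and the function names are
assumptions for illustration. It only builds the request URLs; fetching
each page and posting the documents to the new index is left out, and only
fields declared with stored="true" would come back.

```python
from urllib.parse import urlencode


def export_page_url(base_url, start, rows=100):
    """Build a select URL that asks for one page of documents as JSON."""
    params = {
        "q": "id:[* TO *]",  # range query from the thread: every doc with an id
        "start": start,      # offset of the first document in this page
        "rows": rows,        # page size (assumed value, tune as needed)
        "wt": "json",        # JSON response writer
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


def page_starts(num_found, rows=100):
    """Offsets needed to cover num_found documents, rows at a time."""
    return list(range(0, num_found, rows))


# Hypothetical local instance and document count.
for start in page_starts(250, rows=100):
    print(export_page_url("http://localhost:8983/solr", start, rows=100))
```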