From: Alexandre Rafalovitch
Date: Tue, 27 Jan 2015 12:09:12 -0500
Subject: Re: What is the recommended way to import and update index records?
To: solr-user

What do you mean by "update"? If you mean partial update, DIH does not
do it AFAIK. If you mean replace, it should. If you are getting
duplicate records, maybe your uniqueKey is not set correctly?

clean=false looks to me like the right approach for incremental updates.

Regards,
   Alex.
----
Sign up for my Solr resources newsletter at http://www.solr-start.com/

On 27 January 2015 at 11:43, Carl Roberts wrote:
> Also, if I try full-import and clean=false with the same XML file, I end up
> with more records each time the import runs. How can I make SOLR just add
> the records that are new by id, and update the ones that have an id that
> matches the one in the existing index?
>
> On 1/27/15, 11:32 AM, Carl Roberts wrote:
>>
>> Hi,
>>
>> What is the recommended way to import and update index records?
>>
>> I've read the documentation and I've experimented with full-import and
>> delta-import and I am not seeing the desired results.
>>
>> Basically, I have 15 RSS feeds that I am importing through
>> rss-data-config.xml.
>>
>> The first RSS feed should be a full import and the ones that follow may
>> contain the same id, in which case the existing id in the index should be
>> updated from the record in the new RSS feed.
>> Also there may be new records in the RSS feeds that follow the first one,
>> in which case I want them added to the index.
>>
>> When I try full-import for each entity, the index is cleared and I just
>> end up with the records for the last import.
>>
>> When I try full-import for each entity, with the clean=false parameter,
>> all the records from each entity are added to the index and I end up with
>> duplicate records.
>>
>> When I try delta-import for the entities that follow the first one, I don't
>> get any new index records.
>>
>> How should I do this?
>>
>> Regards,
>>
>> Joe
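
To make the uniqueKey point in the reply concrete, here is a toy Python model (not Solr or DIH code) of what importing with clean=false does to an index. The function name `import_feed`, the sample documents, and the field names `id` and `title` are all made up for illustration. The only behaviour it tries to mirror is that a document whose key matches an existing one replaces it, and that without a usable key every import simply adds documents.

```python
from typing import Dict, Iterable, List, Optional

Doc = Dict[str, str]


def import_feed(index: List[Doc], feed: Iterable[Doc],
                unique_key: Optional[str]) -> List[Doc]:
    """Toy model of an import with clean=false: nothing is deleted up front."""
    if unique_key is None:
        # No key to match on, so every incoming document is appended.
        return index + list(feed)
    # Key the existing documents by their unique key (insertion order is kept).
    by_key = {doc[unique_key]: doc for doc in index}
    for doc in feed:
        # A matching key replaces the whole document; a new key adds one.
        by_key[doc[unique_key]] = doc
    return list(by_key.values())


# Hypothetical sample data standing in for two of the RSS feeds.
feed_1 = [{"id": "a", "title": "first"}, {"id": "b", "title": "second"}]
feed_2 = [{"id": "b", "title": "second, revised"}, {"id": "c", "title": "third"}]

with_key = import_feed(import_feed([], feed_1, "id"), feed_2, "id")
without_key = import_feed(import_feed([], feed_1, None), feed_2, None)

print(len(with_key))     # 3: "b" was replaced, "c" was added
print(len(without_key))  # 4: "b" is now in the index twice
```

If the real index behaves like the `without_key` case, the schema's uniqueKey (or the feed field mapped onto it) is the first thing to check.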