Date: Mon, 7 Aug 2006 06:08:05 -0700 (PDT)
From: vasu shah
Subject: Re: Modify index on database update
To:
java-user@lucene.apache.org

Thanks Michael. You explained it very nicely. I will look into the third approach. The first and second approaches are not feasible for me.

Thanks again.
-Vasu

Michael McCandless wrote:

> My application database can also be updated from outside the application. Whenever some other source changes the database, I want to update my index.
>
> Is there any way to do so?
>
> I am using Java and the database is DB2. I saw the DB2 UDF, but I would have to put the jar inside the DB2 installation directory, and I don't know how to communicate with my application/update the index.

Yes, this certainly is possible.

The simplest approach would be to rebuild the entire index periodically; however, this is not very efficient.

A better approach is to reindex just the rows that have changed since you last indexed. To do this, you first need to implement your own method of tracking which rows of which tables have changed inside DB2. Perhaps you could use a timestamp column and then issue a SELECT WHERE timestamp > last-index-time. Make sure last-index-time is DB2's timestamp to avoid clock skew issues.

Then, for each changed document, you need to add the new doc and remove the old one (Lucene doesn't have a "replace document" operation). Typically you would index the primary key into Lucene, and then use IndexReader's or IndexModifier's 'deleteDocuments(Term term)' to delete the document by its primary key. Then add the new document.

If you have a group of documents that need re-indexing, it's best to first delete all of them, and then re-add all of them (ie, bunch the deletes & adds).
This gives better performance.

The third (and likely best, depending on tradeoffs) approach is to have some sort of "push" or "triggers" coming out of DB2 that notify you whenever a row is changed. If UDF inside DB2 allows for this, that would be great (I don't know anything about UDF; likely you'd need to roll your own such communication coming out of DB2). Then, you re-index each document when it's changed instead of polling periodically.

In any case, make sure you close/re-open your IndexSearchers periodically so that they see the newly deleted/added documents. And make sure the lock directory is the same directory across all of your processes, and that it is a directory in a locally mounted file system (ie not NFS or Samba), as there are known issues with those.

Mike
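
For reference, here is a minimal sketch of the second approach from the message above: pick up the rows changed since the last index run, bunch all the deletes first, then do all the adds. It is only an illustration under stated assumptions. The `Row`, `Index` and `MapIndex` types and the `reindexChanged` method are hypothetical and are not part of Lucene or DB2. The loop over an in-memory list stands in for the SELECT WHERE timestamp > last-index-time query, and `deleteByKey` stands in for a call such as deleteDocuments(Term term) on the primary-key field.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types come from Lucene or DB2.
final class ReindexSketch {

    /** One database row: primary key, text to index, and the DB-side change time. */
    static final class Row {
        final String primaryKey;
        final String text;
        final Instant modifiedAt; // assumed to come from the database clock

        Row(String primaryKey, String text, Instant modifiedAt) {
            this.primaryKey = primaryKey;
            this.text = text;
            this.modifiedAt = modifiedAt;
        }
    }

    /** Stand-in for the search index (assumption, not a Lucene interface). */
    interface Index {
        void deleteByKey(String primaryKey);
        void add(Row row);
    }

    /** In-memory Index that records each call, so the sketch can run on its own. */
    static final class MapIndex implements Index {
        final Map<String, String> docs = new LinkedHashMap<>();
        final List<String> log = new ArrayList<>();

        @Override
        public void deleteByKey(String primaryKey) {
            docs.remove(primaryKey);
            log.add("delete " + primaryKey);
        }

        @Override
        public void add(Row row) {
            docs.put(row.primaryKey, row.text);
            log.add("add " + row.primaryKey);
        }
    }

    /**
     * Re-indexes every row modified after lastIndexTime: all deletes, then all adds.
     * Returns the new watermark, which is the latest modifiedAt seen, or
     * lastIndexTime unchanged if no row qualified.
     */
    static Instant reindexChanged(List<Row> table, Instant lastIndexTime, Index index) {
        List<Row> changed = new ArrayList<>();
        Instant watermark = lastIndexTime;
        for (Row row : table) { // stands in for the timestamp query against the database
            if (row.modifiedAt.isAfter(lastIndexTime)) {
                changed.add(row);
                if (row.modifiedAt.isAfter(watermark)) {
                    watermark = row.modifiedAt;
                }
            }
        }
        for (Row row : changed) { // bunch the deletes...
            index.deleteByKey(row.primaryKey);
        }
        for (Row row : changed) { // ...then bunch the adds
            index.add(row);
        }
        return watermark;
    }

    public static void main(String[] args) {
        MapIndex index = new MapIndex();
        List<Row> table = Arrays.asList(
                new Row("1", "updated text", Instant.ofEpochSecond(200)),
                new Row("2", "unchanged text", Instant.ofEpochSecond(50)));
        Instant next = reindexChanged(table, Instant.ofEpochSecond(100), index);
        System.out.println(index.log + " next watermark: " + next);
    }
}
```

The watermark is taken from the rows' own timestamps rather than from the application's clock, which is one way to follow the advice about avoiding clock skew. With a real index, any open searchers would still need to be closed and re-opened afterwards to see the changes.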