From: "Dolan, Kelly"
Reply-To: dev@jackrabbit.apache.org
Subject: RE: is doc addition / indexing synchronous or asynchronous?
Date: Wed, 4 May 2011 16:38:43 -0400
Message-ID: <5FBB5FB921F28142B2BC6685307807FF011C45B4@Falcon.inmedius.com>
In-Reply-To: <5FBB5FB921F28142B2BC6685307807FF011603AB@Falcon.inmedius.com>

If I modify SearchManager so that it implements EventListener instead of SynchronousEventListener, indexing occurs in a background thread. If I proceed with this change, will it break anything in Jackrabbit? That is, is there any operation that modifies the repository, immediately runs a search expecting to find what was just added, and fails if it does not?

Kelly

________________________________
From: Dolan, Kelly [mailto:kdolan@inmedius.com]
Sent: Tuesday, May 03, 2011 4:03 PM
To: dev@jackrabbit.apache.org
Subject: is doc addition / indexing synchronous or asynchronous?

(Re-posting since it didn't seem like my original email was sent out; my apologies if I'm mistaken.)

I found a thread from Apr 2006 (http://jackrabbit.510166.n4.nabble.com/Is-doc-addition-indexing-synchronous-or-asynchronous-td528243.html).

I find myself in a similar situation. I'm adding lots of documents to the repository at once and it's taking a great deal of time. The majority of that time is spent indexing, so I need to change my configuration or extend SearchIndex such that indexing occurs asynchronously ... I really do not have a choice.

I followed most of the thread conversation but I'm not sure I totally understand everything.

(1) The thread mentions that the observation events are synchronous. Is it possible to change this to be asynchronous?
(2) Marcel brought up two issues with (1):
    (a) a search may not "hit" a document just added; there would be a delay
    (b) if the JVM crashed, documents not yet indexed would never be indexed, and this cannot be recovered

I can live with (a) but not (b). The thread continued on re: (b) wrt persisting what needs to be indexed. That is where I started to get lost. While (b) was mentioned, it seemed like Jackrabbit handles it with a redo.log.

In any case, I need to make indexing asynchronous. I had started down the path of extending SearchIndex and overriding the updateNodes() method, but now I'm wondering whether there is just a way I can configure Jackrabbit to make indexing asynchronous, or whether there are still serious issues I have not considered. Or is extending SearchIndex and overriding the updateNodes() method what I should do?

I'm currently integrated with Jackrabbit 1.6. I'm not sure if I can upgrade to the latest version at this time, but if a later version buys me something, please let me know.

Kelly
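
To make issue (a) concrete, here is a minimal, self-contained sketch of synchronous versus background indexing. It is a toy model only: `IndexingDispatchSketch` and `ToyIndex` are hypothetical names invented for illustration, not Jackrabbit classes, and nothing here reproduces how SearchManager or the observation layer is actually implemented. The latch exists purely to make the indexing delay deterministic in the demo.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Toy model only: none of these types are Jackrabbit classes. */
public class IndexingDispatchSketch {

    /** Hypothetical stand-in for the search index. */
    static final class ToyIndex {
        private final Set<String> indexed = ConcurrentHashMap.newKeySet();
        void add(String nodeId) { indexed.add(nodeId); }
        boolean contains(String nodeId) { return indexed.contains(nodeId); }
    }

    /** Synchronous dispatch: the node is indexed before "save" returns. */
    public static boolean visibleRightAfterSyncSave() {
        ToyIndex index = new ToyIndex();
        index.add("doc-1");                  // the indexer runs inline with the save
        return index.contains("doc-1");
    }

    /**
     * Asynchronous dispatch: indexing is handed to a background thread.
     * Returns { visible right after save, visible after the queue drains }.
     */
    public static boolean[] asyncVisibility() {
        ToyIndex index = new ToyIndex();
        ExecutorService background = Executors.newSingleThreadExecutor();
        // The gate holds the indexer back so the delay is deterministic here;
        // a real background indexer would simply be busy or not yet scheduled.
        CountDownLatch gate = new CountDownLatch(1);

        // "save": queue the indexing work instead of doing it inline.
        // The queued task lives only in memory, which is the root of issue (b).
        background.submit(() -> {
            gate.await();
            index.add("doc-1");
            return null;
        });

        boolean immediately = index.contains("doc-1"); // search right after save

        gate.countDown();                    // let the background indexer run
        background.shutdown();
        try {
            boolean drained = background.awaitTermination(5, TimeUnit.SECONDS);
            return new boolean[] { immediately, drained && index.contains("doc-1") };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for indexer", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("sync, visible after save:   " + visibleRightAfterSyncSave());
        boolean[] async = asyncVisibility();
        System.out.println("async, visible after save:  " + async[0]);
        System.out.println("async, visible after drain: " + async[1]);
    }
}
```

In this model the synchronous save is searchable at once, while the asynchronous save is not searchable until the background queue has drained. Any code that saves and then immediately searches would see the second behaviour, and any work still sitting in the in-memory queue would be lost on a crash unless it were persisted somewhere first.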