(re-posting since it didn’t seem like my original email was sent out, my apologies if I’m mistaken)



i found a thread from Apr 2006 (http://jackrabbit.510166.n4.nabble.com/Is-doc-addition-indexing-synchronous-or-asynchronous-td528243.html). 


i find myself in a similar situation. i'm adding lots of documents to the repository at once and it's taking a great deal of time. the majority of that time is spent indexing, so i need to either change my configuration or extend SearchIndex such that indexing occurs asynchronously ... i really do not have a choice.


i followed most of the thread conversation but i'm not sure i totally understand everything.


(1) the thread mentions that the observation events are synchronous.  is it possible to change this to be asynchronous?

(2) marcel brought up two issues with (1)

    (a) a search may not "hit" a document just added; there would be a delay

    (b) if the jvm crashed, documents not yet indexed would never get indexed, and this could not be recovered


i can live with (a) but not (b). the thread continued on re: (b) wrt persisting what still needs to be indexed.  that is where i started to get lost.  although (b) was raised as a problem, it seemed like jackrabbit already handles it with a redo.log.
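to check my understanding of (b), here is a rough sketch of what i think "persisting what still needs to be indexed" would look like. this is not jackrabbit code and does not use jackrabbit's redo.log: the class AsyncIndexQueue, the "ADD"/"DONE" journal format and the indexer callback are all names i made up for illustration. the idea is that the node id is written to a journal file before the call returns, the actual indexing happens on a background thread, and after a crash anything journaled but not marked done gets replayed. it assumes an id is not submitted a second time while it is still pending.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch: asynchronous indexing backed by a simple journal file.
public class AsyncIndexQueue implements AutoCloseable {
    private final Path journal;
    private final Consumer<String> indexer; // stand-in for the real indexing work
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    public AsyncIndexQueue(Path journal, Consumer<String> indexer) throws IOException {
        this.journal = journal;
        this.indexer = indexer;
        // Crash recovery: re-queue everything journaled but never marked done.
        for (String id : pending(journal)) {
            worker.execute(() -> indexAndMarkDone(id));
        }
    }

    // Returns as soon as the id is durably journaled; indexing happens later.
    public void submit(String nodeId) throws IOException {
        append("ADD " + nodeId);
        worker.execute(() -> indexAndMarkDone(nodeId));
    }

    private void indexAndMarkDone(String nodeId) {
        try {
            indexer.accept(nodeId);
            append("DONE " + nodeId);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // SYNC asks the OS to flush each record to disk before returning.
    private synchronized void append(String line) throws IOException {
        Files.write(journal, (line + "\n").getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND,
                StandardOpenOption.SYNC);
    }

    // Ids that have an ADD record but no later DONE record, in journal order.
    public static Set<String> pending(Path journal) throws IOException {
        Set<String> pending = new LinkedHashSet<>();
        if (!Files.exists(journal)) {
            return pending;
        }
        for (String line : Files.readAllLines(journal, StandardCharsets.UTF_8)) {
            if (line.startsWith("ADD ")) {
                pending.add(line.substring(4));
            } else if (line.startsWith("DONE ")) {
                pending.remove(line.substring(5));
            }
        }
        return pending;
    }

    // Waits (up to a minute) for queued work to finish before returning.
    @Override
    public void close() {
        worker.shutdown();
        try {
            worker.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

with something like this, (a) still applies (a search can miss a document until the worker catches up), but (b) would be covered because the journal survives a jvm crash. the journal also grows without bound here; a real version would need to truncate it once everything is done.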


in any case, i need to make indexing asynchronous.  i had started down the path of extending SearchIndex and overriding the updateNodes() method, but now i'm wondering whether there is a way to simply configure jackrabbit to index asynchronously, or whether there are still serious issues i have not considered. or is extending SearchIndex and overriding updateNodes() what i should do?


i'm currently integrated with jackrabbit 1.6.  i'm not sure if i can upgrade to the latest version at this time but if a later version buys me something, please let me know.