jackrabbit-users mailing list archives

From Ivan Latysh <ivanlat...@gmail.com>
Subject Re: Newbie: how do I find all nodes that have changed?
Date Sat, 16 Feb 2008 02:10:55 GMT
Chris wrote:

> I'm building a crawler that needs to find all the documents in a 
> repository. Once I do the first crawl, how do I go back later and get 
> all the documents that have changed?
> I could do a full recrawl, but I was hoping there was a faster way to 
> find the nodes that had been inserted/updated/deleted since the last crawl.

  If your use case allows you to register a listener, you can listen for 
modification events.
  On the other hand, if you are taking snapshots, you can add a modified-time 
attribute to all nodes; when you need to find everything that was updated, 
just select the nodes whose modified-time is later than your last snapshot.
  But this task is the same as with an RDBMS: how do you select all updated 
rows from a table ...
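
  To make the modified-time idea concrete, here is a minimal sketch in plain 
Java. It does not use the JCR API: the map of path to modified-time stands in 
for the repository, and the names ModifiedSince and changedSince are 
hypothetical, chosen only for illustration. Note that a deleted node has no 
attribute left to select on, so this approach only finds inserted and updated 
nodes.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ModifiedSince {

    // Hypothetical helper: returns the paths whose modified-time is strictly
    // later than the last crawl, sorted alphabetically.
    public static List<String> changedSince(Map<String, Instant> modifiedTimes,
                                            Instant lastCrawl) {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, Instant> entry : modifiedTimes.entrySet()) {
            if (entry.getValue().isAfter(lastCrawl)) {
                changed.add(entry.getKey());
            }
        }
        Collections.sort(changed);
        return changed;
    }

    public static void main(String[] args) {
        // Assumption: every node stores a modified-time that the application
        // updates on each write. The map is a stand-in for the repository.
        Map<String, Instant> modifiedTimes = new HashMap<>();
        modifiedTimes.put("/docs/a", Instant.parse("2008-02-01T00:00:00Z"));
        modifiedTimes.put("/docs/b", Instant.parse("2008-02-15T12:00:00Z"));

        Instant lastCrawl = Instant.parse("2008-02-10T00:00:00Z");
        System.out.println(changedSince(modifiedTimes, lastCrawl)); // prints [/docs/b]
    }
}
```

  In a real repository the loop would be replaced by a query on the 
modified-time property, and the crawler would store the time of each crawl to 
use as the lower bound for the next one.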

Ivan Latysh
