lucene-general mailing list archives

From rodrigofurt...@saneago.com.br
Subject Re: how to Index only newly added documents?
Date Tue, 03 Nov 2009 12:26:48 GMT
Take a look at this class:

org.pdfbox.searchengine.lucene.IndexFiles

It is an example class that creates an index and keeps it up to date when you
add documents to, or delete documents from, a directory.

Basically, you choose the mode with the arguments you pass when you run the
class.

To create the index directory, try this:

java -Xms256m -Xmx512m org.pdfbox.searchengine.lucene.IndexFiles -create -index <your_index_directory> <your_documents_directory>


To update an existing index only (picking up new or deleted files), try this
(note that the '-create' argument is not present):


java -Xms256m -Xmx512m org.pdfbox.searchengine.lucene.IndexFiles -index <your_index_directory> <your_documents_directory>
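
If you would rather do the filtering in your own program, the general idea behind an incremental run can be sketched as below. This is only a rough illustration under my own assumptions, not the actual code of IndexFiles: the class name IncrementalPlan and the methods toIndex and toDelete are hypothetical, and the two maps (file path to last-modified time) stand in for what is already in the index and what is currently in the documents directory. No Lucene calls are shown.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of PDFBox or Lucene.
public class IncrementalPlan {

    // Files that need indexing: present on disk but unknown to the index,
    // or known to the index with a different last-modified time.
    static Set<String> toIndex(Map<String, Long> indexed, Map<String, Long> onDisk) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Long> file : onDisk.entrySet()) {
            Long known = indexed.get(file.getKey());
            if (known == null || !known.equals(file.getValue())) {
                result.add(file.getKey());
            }
        }
        return result;
    }

    // Index entries that are stale: the file is gone from disk, or it has
    // changed and will be added again by toIndex.
    static Set<String> toDelete(Map<String, Long> indexed, Map<String, Long> onDisk) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Long> entry : indexed.entrySet()) {
            Long current = onDisk.get(entry.getKey());
            if (current == null || !current.equals(entry.getValue())) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data: path -> last-modified timestamp.
        Map<String, Long> indexed = new TreeMap<>();
        indexed.put("a.pdf", 100L);
        indexed.put("b.pdf", 200L);
        indexed.put("old.pdf", 50L);

        Map<String, Long> onDisk = new TreeMap<>();
        onDisk.put("a.pdf", 100L);   // unchanged, skipped
        onDisk.put("b.pdf", 250L);   // modified since it was indexed
        onDisk.put("c.pdf", 300L);   // newly added

        System.out.println("index:  " + toIndex(indexed, onDisk));
        System.out.println("delete: " + toDelete(indexed, onDisk));
    }
}
```

In a real indexer you would probably store the path and the last-modified time as fields of each Lucene document, read them back to build the "indexed" map, and then apply the two sets with IndexWriter.deleteDocuments(Term) and IndexWriter.addDocument. Treat that part as a suggestion and check it against the Lucene version you are using.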


Bye

>
> Hi People,
>
> I am stuck with a problem. I have a resources directory that holds a lot
> of documents, and my Java program picks up documents from this directory.
> Is there a way, using the Lucene APIs, to recognize documents that have
> already been indexed, so that I can filter them out and process only newly
> added documents?
>
> Thanks
> Tarun


