lucene-general mailing list archives

From tarunsapra <t.sapr...@gmail.com>
Subject Re: how to Index only newly added documents?
Date Wed, 04 Nov 2009 05:41:02 GMT

Thanks for the reply!

But I need to filter out the documents that are already indexed. For example,
if the resources directory contains 2 documents that have been indexed, and 2
more documents are then added, the indexer should index only the newly added
documents into the already existing index location.
Thanks

rodrigofurtado wrote:
> 
> Look at the class:
> 
> org.pdfbox.searchengine.lucene.IndexFiles
> 
> This is an example class for creating and updating an index when you add or
> delete documents in a directory.
> 
> Basically, you indicate which mode you want when you run this class.
> 
> To create the index directory, try this:
> 
> java -Xms256m -Xmx512m org.pdfbox.searchengine.lucene.IndexFiles -create
> -index  <your_index_directory> <your_documents_directory>
> 
> 
> To only update the index (for new or deleted files), try this (note that the
> '-create' argument is not present):
> 
> 
> java -Xms256m -Xmx512m org.pdfbox.searchengine.lucene.IndexFiles -index 
> <your_index_directory> <your_documents_directory>
> 
> 
> Bye
> 
>>
>> Hi People,
>>
>> I am stuck with a problem. I have a resources directory containing a lot of
>> documents, and my Java program picks up documents from this directory. Is
>> there a way, using the Lucene APIs, to recognize documents that have already
>> been indexed, filter them out, and use only the newly added documents?
>>
>> Thanks
>> Tarun
>> --
>> View this message in context:
>> http://old.nabble.com/how-to-Index-only-newly-added-documents--tp26160082p26160082.html
>> Sent from the Lucene - General mailing list archive at Nabble.com.
>>
>>
> 
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/how-to-Index-only-newly-added-documents--tp26160082p26191281.html
Sent from the Lucene - General mailing list archive at Nabble.com.
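
One way to get the filtering asked about above, sketched here with plain Java file APIs only: keep a small manifest file listing the names of documents that have already been indexed, and on each run hand only the unlisted files to the indexer. The class name NewDocumentFilter, its method names, and the manifest file are all hypothetical and are not part of Lucene or PDFBox; the actual indexing call is left as a placeholder comment. The sketch assumes the manifest lives outside the resources directory and that documents are identified by file name alone.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: tracks which files were already indexed in a manifest file.
public class NewDocumentFilter {

    /** Reads the names recorded as indexed; returns an empty set if no manifest exists yet. */
    public static Set<String> loadIndexed(Path manifest) throws IOException {
        if (!Files.exists(manifest)) {
            return new HashSet<>();
        }
        return new HashSet<>(Files.readAllLines(manifest, StandardCharsets.UTF_8));
    }

    /** Returns the regular files in resourcesDir whose names are not in the indexed set, sorted. */
    public static List<Path> findNew(Path resourcesDir, Set<String> indexed) throws IOException {
        try (Stream<Path> files = Files.list(resourcesDir)) {
            return files.filter(Files::isRegularFile)
                        .filter(p -> !indexed.contains(p.getFileName().toString()))
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    /** Appends the names of the given documents to the manifest, creating it if needed. */
    public static void markIndexed(Path manifest, List<Path> docs) throws IOException {
        List<String> names = new ArrayList<>();
        for (Path p : docs) {
            names.add(p.getFileName().toString());
        }
        Files.write(manifest, names, StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: NewDocumentFilter <resources_dir> <manifest_file>");
            return;
        }
        Path resourcesDir = Paths.get(args[0]);
        Path manifest = Paths.get(args[1]); // assumed to be outside resourcesDir

        List<Path> fresh = findNew(resourcesDir, loadIndexed(manifest));
        for (Path doc : fresh) {
            // Placeholder: add the document to the existing index here.
            System.out.println("would index: " + doc);
        }
        // Record the files only after the indexing step has succeeded.
        markIndexed(manifest, fresh);
    }
}
```

This sketch does not notice files that were modified or deleted after being indexed; recording each file's last-modified time next to its name would be one way to extend it. Within Lucene itself, IndexWriter.updateDocument(Term, Document) replaces any document matching the given term, so storing the file path in a field and updating on that field is another option for avoiding duplicates.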

