lucene-java-user mailing list archives

From gudiseashok <>
Subject Re: Reindexing problem: Indexing folder size is keep on growing for same remote folder
Date Tue, 01 Oct 2013 17:41:13 GMT
I am really sorry if something confused you. As I said, I am indexing a
folder which contains mylogs.log, mylogs1.log, mylogs2.log, etc. I am not
indexing them as flat files.
I tokenize each line of text with a regex and store the pieces as fields
such as "messageType".
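
To make the per-line tokenization concrete, here is a minimal sketch using
only java.util.regex. The line layout, the "timestamp" and "message" field
names, and the LogLineParser class are assumptions made up for
illustration; only the "messageType" field name comes from the post.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LogLineParser {
    // Hypothetical line layout: "<date> <time> [<messageType>] <message>".
    // The real log format is not given in the post.
    private static final Pattern LINE = Pattern.compile(
            "^(?<timestamp>\\S+ \\S+) \\[(?<messageType>\\w+)\\] (?<message>.*)$");

    /** Returns field name -> value, or an empty map if the line does not match. */
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = LINE.matcher(line);
        if (m.matches()) {
            fields.put("timestamp", m.group("timestamp"));
            fields.put("messageType", m.group("messageType"));
            fields.put("message", m.group("message"));
        }
        return fields;
    }
}
```

Each entry of the returned map would then become one field of the document
built for that row.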

So I don't care which of those 4 files holds a particular record; I just
want to insert only new records.
My job routine updates these log files every 30 minutes, and I store each
row as a document. So when I read the files again after 30 minutes for
indexing, mylogs1.log will contain the previous version of the mylogs.log
content. So if a row exists with the same data,
So if I want to avoid writing the same record (from another file among
those 4) again,
Could you please suggest what I need to do while calling add or

Do I need to run a search before inserting each row, or is there a better
way to avoid writing duplicates?
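
One possible way to avoid a search before every insert is to derive a
stable key from the row content, so that the same row always maps to the
same key no matter which of the rotated files it is read from. The sketch
below uses only the JDK; the RowDeduplicator class, the choice of SHA-256,
and the in-memory set are assumptions for illustration, not something from
this thread. In Lucene, such a key could be stored in a non-tokenized field
and passed as the Term to IndexWriter.updateDocument, which deletes any
document containing that term before adding the new one; that part is not
shown here because it needs the Lucene jars.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class RowDeduplicator {
    // Keys of rows already accepted. This in-memory set only stands in for
    // a unique-key field in the index (an assumption for this sketch).
    private final Set<String> seenKeys = new HashSet<>();

    /** Hex-encoded SHA-256 of the row text; identical rows give identical keys. */
    static String rowKey(String row) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(row.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Returns only the rows whose key has not been seen before, in input order. */
    List<String> filterNew(List<String> rows) {
        List<String> fresh = new ArrayList<>();
        for (String row : rows) {
            // Set.add returns false when the key is already present.
            if (seenKeys.add(rowKey(row))) {
                fresh.add(row);
            }
        }
        return fresh;
    }
}
```

Note that a content-based key also collapses two rows that are legitimately
identical (same timestamp and same message), so this only fits if such
repeats do not need to be kept apart.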

I really appreciate your time reading this, and thanks for responding.
