lucene-java-user mailing list archives

From gudiseashok <gudise.as...@gmail.com>
Subject Re: Rendexing problem: Indexing folder size is keep on growing for same remote folder
Date Tue, 01 Oct 2013 17:41:13 GMT
I am sorry if my earlier message was confusing. As I said, I am indexing a
folder that contains mylogs.log, mylogs1.log, mylogs2.log, etc. I am not
indexing them as flat files.
I tokenize each line of text with a regex and store the parts as fields
such as "messageType", "timeStamp", and "message".
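
For illustration, the line splitting described above could look roughly like
the sketch below. The line layout ("<date> <time> <LEVEL> <text>"), the regex,
and the class name LogLineFields are assumptions made for this example only;
the real log format is not shown in this message.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogLineFields {
    // Assumed (hypothetical) layout: "yyyy-MM-dd HH:mm:ss LEVEL message text"
    private static final Pattern LINE = Pattern.compile(
            "^(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2})\\s+(\\w+)\\s+(.*)$");

    /** Returns the three named fields, or null when the line does not match. */
    public static Map<String, String> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("timeStamp", m.group(1));
        fields.put("messageType", m.group(2));
        fields.put("message", m.group(3));
        return fields;
    }

    public static void main(String[] args) {
        // Prints the parsed fields of one sample line.
        System.out.println(parse("2013-10-01 17:41:13 ERROR Disk quota exceeded"));
    }
}
```

Each returned map would then correspond to one document, with one field per
map entry.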

So I do not care which of those 4 files contains a particular record; I
just want to insert only new records.
My job routine updates these log files every 30 minutes, and I store each
row as a document. When I read the files again after 30 minutes for
indexing, mylogs1.log will contain the previous version of the mylogs.log
content, so a row with the same data may already exist in the index.
If I want to avoid writing the same record again (from another file among
those 4), could you please suggest what I need to do when calling
addDocument or updateDocument?

Do I need to run a search before inserting each row, or is there a better
way to avoid writing duplicates?
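
One possible direction, sketched under assumptions rather than offered as a
confirmed answer: derive a stable key from the three fields of each row and
store it in its own non-tokenized field. IndexWriter.updateDocument(Term, doc)
first deletes any document containing the given term and then adds the new
one, so passing a term built from that key would replace a duplicate instead
of adding it again, with no search beforehand. The class name RecordKey and
the choice of SHA-256 are hypothetical; only the key computation is shown.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class RecordKey {
    /**
     * Hex SHA-256 over the three fields. A newline separates them because
     * each record is a single log line, so no field can contain one.
     */
    public static String of(String timeStamp, String messageType, String message) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            String joined = timeStamp + '\n' + messageType + '\n' + message;
            byte[] digest = md.digest(joined.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The same row read from two different files yields the same key.
        String fromFirstFile = of("2013-10-01 17:41:13", "ERROR", "Disk quota exceeded");
        String fromSecondFile = of("2013-10-01 17:41:13", "ERROR", "Disk quota exceeded");
        System.out.println(fromFirstFile.equals(fromSecondFile));
    }
}
```

One caveat of this sketch: two log lines that are genuinely identical in all
three fields would collapse into a single document.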

I really appreciate your time reading this, and thanks for responding.



