lucene-java-user mailing list archives

From Ian Lea <>
Subject Re: Reindexing problem: Indexing folder size keeps growing for same remote folder
Date Wed, 02 Oct 2013 09:14:08 GMT
Yes, as I suggested, you could search on your unique id and skip
indexing if it is already present.  Or, as Uwe suggested, call
updateDocument instead of addDocument, again keyed on the unique id.
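To make the second option concrete, here is a minimal sketch of the updateDocument approach. The field names ("uid", "message") and the id scheme (timestamp + message text) are assumptions for illustration, not anything from this thread; the key point is that updateDocument deletes any existing document matching the id term before adding the new one, so re-indexing the same log line is idempotent.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

public class LogUpsert {
    // Index one parsed log line, keyed by a caller-supplied unique id.
    // updateDocument first deletes any document whose "uid" term matches,
    // then adds the new document, so two calls with the same id leave one copy.
    static void upsert(IndexWriter writer, String uid, String message) throws Exception {
        Document doc = new Document();
        doc.add(new StringField("uid", uid, Field.Store.YES));   // exact-match key, not analyzed
        doc.add(new TextField("message", message, Field.Store.YES));
        writer.updateDocument(new Term("uid", uid), doc);
    }

    public static void main(String[] args) throws Exception {
        Directory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()));

        // The same record seen once in mylogs.log and again in mylogs1.log
        // after rotation: the second call replaces the first instead of
        // duplicating it.
        String uid = "2013-10-01 12:00:00|ERROR|connection lost";  // hypothetical id scheme
        upsert(writer, uid, "connection lost");
        upsert(writer, uid, "connection lost");
        writer.commit();

        DirectoryReader reader = DirectoryReader.open(dir);
        System.out.println(reader.numDocs()); // prints 1, not 2
        reader.close();
        writer.close();
        dir.close();
    }
}
```

The first option (search before add) works too, but it costs an extra query per record; updateDocument handles the delete-then-add internally, which is why it is usually the simpler fix for this kind of duplicate-on-reindex problem.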


On Tue, Oct 1, 2013 at 6:41 PM, gudiseashok <> wrote:
> I am really sorry for the confusion. As I said, I am indexing a folder
> that contains mylogs.log, mylogs1.log, mylogs2.log, etc.; I am not
> indexing them as flat files. I tokenize each line of text with a regex
> and store the pieces as fields such as "messageType", "timeStamp", and
> "message".
> So I don't care which of those 4 files contains a particular record; I
> just want to insert new records only.
> My job runs every 30 minutes, reads these log files, and stores each row
> as a document. Because the files rotate, when I read them again after 30
> minutes, mylogs1.log holds the previous contents of mylogs.log, so the
> same row can show up again from a different file.
> If I want to avoid writing the same record again (from another of those
> 4 files), what do I need to do when calling addDocument or
> updateDocument?
> Do I need to run a search before inserting each row, or is there a
> better way to avoid duplicate writes?
> I really appreciate your time reading this, and thanks for responding.

