lucene-java-user mailing list archives

From gudiseashok <>
Subject Reindexing problem: index folder size keeps growing for the same remote folder
Date Tue, 01 Oct 2013 00:01:12 GMT

I am reading log files from a remote folder, copying them to a local folder,
and then indexing them as shown below. However, I am not storing the whole
content as an input stream; I split each entry with a Grok regex and store
the pieces as the string fields below.

I repeat this copy-and-index process every 30 minutes (as a cron job), and
my index folder doubles in size on completion of each batch. I am using
Lucene 4.4. After looking at the addDocument implementation in IndexWriter,
I assume that it performs an update, even though I call addDocument on a
writer configured with CREATE_OR_APPEND.
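
The growth described above is what one would expect if every batch appends a fresh copy of the same log entries next to the earlier ones. The following is only a toy model in plain Java, not Lucene code: the class `ReindexModel`, its method names, and the choice of the log line itself as the key are all hypothetical. It contrasts an append-only store with one that replaces entries sharing a key, which is roughly the difference between IndexWriter.addDocument and IndexWriter.updateDocument(Term, ...) with a unique-id term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReindexModel {

    // Append-only store: each run adds every entry again,
    // so the same batch indexed twice is stored twice.
    public static int appendOnly(List<String> batch, int runs) {
        List<String> index = new ArrayList<>();
        for (int run = 0; run < runs; run++) {
            index.addAll(batch);
        }
        return index.size();
    }

    // Keyed store: an entry added under an existing key replaces the old one,
    // so re-running the same batch leaves the size unchanged.
    public static int replaceByKey(List<String> batch, int runs) {
        Map<String, String> index = new LinkedHashMap<>();
        for (int run = 0; run < runs; run++) {
            for (String line : batch) {
                // Hypothetical key: the log line itself. A real index would
                // need some field that uniquely identifies each log entry.
                index.put(line, line);
            }
        }
        return index.size();
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("line-1", "line-2", "line-3");
        System.out.println("append-only after 2 runs: " + appendOnly(batch, 2));   // 6
        System.out.println("keyed after 2 runs:       " + replaceByKey(batch, 2)); // 3
    }
}
```

Whether a key-based replacement fits depends on whether each log entry has a stable unique identifier. If it does not, rebuilding the index from scratch on each run is the other obvious option.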

Kindly suggest the right approach if I am not doing this correctly. My
requirement is to pick up the updated content of that log folder every 30
minutes and re-index it.

Thanks for taking the time to read this. Please see my code and
configuration snippets below.

//Adding document
 document.add(new StringField("className", logsVO.getClassName(),
 document.add(new StringField("logLevel", logsVO.getLogLevel(),
 document.add(new TextField("logMessage", logsVO.getLogMessage(),
 document.add(new StringField("messageType",
logsVO.getMessageType().toString(), Field.Store.NO));
 document.add(new LongField("timeStamp", logsVO.getTimeStamp().getTime(),
 IndexWriter writer =  luceneUtil.getIndexWriter();

//addDocument calls the IndexWriter (API class) method shown below:
public void addDocument(Iterable<? extends IndexableField> doc) throws
IOException {
    addDocument(doc, analyzer);
}

//Writer creation approach...

        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_44);
        IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_44, analyzer);
        this.writer = new IndexWriter(dir, iwc);
