lucene-java-user mailing list archives

From Yakob <jacob...@opensuse-id.org>
Subject Re: flushing index
Date Tue, 28 Sep 2010 06:18:03 GMT
On 9/27/10, Uwe Schindler <uwe@thetaphi.de> wrote:
>
>
> Yes. You must close it first, otherwise the addIndexes call will do
> nothing, as the index looks empty to the addIndexes() call (because no
> committed segments are available in the ramDir).
>
> I don't understand what you mean by flushing. If you are working with
> Lucene 2.9 or 3.0, the ramWriter is flushed to the RAMDir on close. The
> addIndexes call will add that index to the on-disk writer. To flush that
> fsWriter (flush is the wrong word; you probably mean commit), simply call
> fsWriter.commit() so the newly added segments are written to disk and
> IndexReaders opened in parallel "see" the new segments.
>
> Btw: If you are working on Lucene 3.0, the addIndexes call does not need the
> new Directory[] {}, as the method is Java 5 varargs now.
>
> Uwe
>
>
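
Just to check that I am following the RAMDirectory part you describe, I
think the flow would look roughly like this (my own rough sketch only,
assuming Lucene 2.9/3.0, with indexDir as a placeholder for my on-disk
index directory; in the 3.0 javadocs the Directory-based method seems to
be addIndexesNoOptimize, which already takes varargs, so no new
Directory[] {} is needed):

    RAMDirectory ramDir = new RAMDirectory();
    IndexWriter ramWriter = new IndexWriter(ramDir, new SimpleAnalyzer(),
            true, IndexWriter.MaxFieldLength.LIMITED);
    // ... add documents to ramWriter here ...
    ramWriter.close();                      // closing commits the in-memory segments

    IndexWriter fsWriter = new IndexWriter(FSDirectory.open(indexDir),
            new SimpleAnalyzer(), IndexWriter.MaxFieldLength.LIMITED);
    fsWriter.addIndexesNoOptimize(ramDir);  // copy the RAM index into the on-disk index
    fsWriter.commit();                      // make the new segments visible to readers
    fsWriter.close();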

I mean that I need to flush the index periodically, i.e. the index should
be updated regularly as documents are added. What do you reckon is the
solution for this? I need some sample source code to be able to flush an
index.
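
Something like the fragment below is what I have in mind for the
flushing/commit part, but I am not sure it is right (this is only my rough
sketch, assuming the IndexWriter.commit() method from Lucene 2.9/3.0, with
indexDir and doc as placeholders; assume it runs inside a method that
declares throws IOException):

    IndexWriter writer = new IndexWriter(
            FSDirectory.open(indexDir),
            new SimpleAnalyzer(),
            IndexWriter.MaxFieldLength.LIMITED);

    // add documents as they come in ...
    writer.addDocument(doc);

    // commit() writes the new segments to disk without closing the writer,
    // so IndexReaders that are (re)opened afterwards see the new documents
    writer.commit();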

OK, my indexing code is just like the source code below.

import java.io.File;
import java.io.FileReader;
import java.io.IOException;

import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;

public class SimpleFileIndexer {

    public static void main(String[] args) throws Exception {

        File indexDir = new File("C:/Users/Raden/Documents/lucene/LuceneHibernate/adi");
        File dataDir = new File("C:/Users/Raden/Documents/lucene/LuceneHibernate/adi");
        String suffix = "txt";

        SimpleFileIndexer indexer = new SimpleFileIndexer();

        int numIndexed = indexer.index(indexDir, dataDir, suffix);

        System.out.println("Total files indexed " + numIndexed);
    }

    /** Builds a new index in indexDir from all files under dataDir that match the suffix. */
    private int index(File indexDir, File dataDir, String suffix) throws Exception {

        IndexWriter indexWriter = new IndexWriter(
                FSDirectory.open(indexDir),
                new SimpleAnalyzer(),
                true,                               // overwrite any existing index
                IndexWriter.MaxFieldLength.LIMITED);
        indexWriter.setUseCompoundFile(false);

        indexDirectory(indexWriter, dataDir, suffix);

        int numIndexed = indexWriter.maxDoc();
        indexWriter.optimize();
        indexWriter.close();

        return numIndexed;
    }

    /** Recursively walks dataDir and indexes every matching file. */
    private void indexDirectory(IndexWriter indexWriter, File dataDir, String suffix)
            throws IOException {
        File[] files = dataDir.listFiles();
        if (files == null) {
            return;   // dataDir does not exist or is not a directory
        }
        for (File f : files) {
            if (f.isDirectory()) {
                indexDirectory(indexWriter, f, suffix);
            } else {
                indexFileWithIndexWriter(indexWriter, f, suffix);
            }
        }
    }

    /** Adds a single file to the index: the contents are tokenized, the path is stored. */
    private void indexFileWithIndexWriter(IndexWriter indexWriter, File f, String suffix)
            throws IOException {
        if (f.isHidden() || f.isDirectory() || !f.canRead() || !f.exists()) {
            return;
        }
        if (suffix != null && !f.getName().endsWith(suffix)) {
            return;
        }
        System.out.println("Indexing file " + f.getCanonicalPath());

        Document doc = new Document();
        doc.add(new Field("contents", new FileReader(f)));
        doc.add(new Field("filename", f.getCanonicalPath(),
                Field.Store.YES, Field.Index.ANALYZED));

        indexWriter.addDocument(doc);
    }

}


The above source code can index documents when given a directory of text
files. What I am asking now is: how can I make the code run continuously?
Which class should I use, so that every time new documents are added to
that directory, Lucene indexes them automatically? Can you help me out on
this one? I really need to know what the best solution is.
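
What I was imagining is something along the lines of the class below,
re-scanning the directory every few seconds and committing whatever is
new, but I do not know whether that is the right approach (this is only my
rough sketch; the ContinuousIndexer name, the alreadyIndexed set, and the
ten-second interval are just things I made up):

import java.io.File;
import java.io.FileReader;
import java.util.HashSet;
import java.util.Set;

import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;

public class ContinuousIndexer {

    public static void main(String[] args) throws Exception {
        File indexDir = new File(args[0]);   // where the index lives
        File dataDir = new File(args[1]);    // directory being watched
        Set<String> alreadyIndexed = new HashSet<String>();

        IndexWriter writer = new IndexWriter(FSDirectory.open(indexDir),
                new SimpleAnalyzer(), IndexWriter.MaxFieldLength.LIMITED);

        while (true) {
            File[] files = dataDir.listFiles();
            if (files != null) {
                for (File f : files) {
                    if (f.isFile() && f.getName().endsWith("txt")
                            && !alreadyIndexed.contains(f.getCanonicalPath())) {
                        Document doc = new Document();
                        doc.add(new Field("contents", new FileReader(f)));
                        doc.add(new Field("filename", f.getCanonicalPath(),
                                Field.Store.YES, Field.Index.ANALYZED));
                        writer.addDocument(doc);
                        alreadyIndexed.add(f.getCanonicalPath());
                    }
                }
            }
            writer.commit();      // "flush" the new documents so readers can see them
            Thread.sleep(10000);  // wait ten seconds before scanning again
        }
    }
}

I am not sure whether polling the directory like this is good practice,
though.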

thanks
-- 
http://jacobian.web.id

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

