lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "mohamed ebrahim faisal" <ebrahim_faisal...@hotmail.com>
Subject Re: Deleting index for DB indexing
Date Fri, 31 Dec 2004 14:05:39 GMT
Hi

U can try out the following code to delete document based on KeyWords


import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

import org.apache.lucene.search.Searcher;

public class LuceneDelete
{
	private static final String[] strSTOP_WORDS =
        {
			"and",
			"are"
			 };
	private void test() throws Exception
	{
		Analyzer objAnalyzer = new StandardAnalyzer();
		IndexWriter index = new IndexWriter("index",objAnalyzer, true );


		Document objDocument = new Document();

		objDocument.add( Field.Keyword("name","Ebrahim Faisal"));
		objDocument.add( Field.Text("address","Chennai"));
		objDocument.add( Field.Keyword("designation","Software Engineer"));
		objDocument.add( Field.UnIndexed("xyz","123 IndexWriter index"));

		index.addDocument( objDocument );

		objDocument = new Document();

		objDocument.add( Field.Keyword("name","John Smith"));
		objDocument.add( Field.Text("address","Delhi"));
		objDocument.add( Field.Keyword("designation","Sr. Software Engineer"));
		objDocument.add( Field.UnIndexed("xyz","456 StandardAnalyzer true"));

		index.addDocument( objDocument );

		index.optimize();
		index.close();

		//Logic for deleting

		IndexReader objIndexReader = IndexReader.open("index");

		TermDocs objTermDocs = objIndexReader.termDocs(new Term("name","Ebrahim 
Faisal"));

		while( objTermDocs.next() )
		{
			int docNum = objTermDocs.doc();
			objDocument = objIndexReader.document( docNum );
			if( objDocument.get("designation").equalsIgnoreCase("Software Engineer"))
			{
				objIndexReader.delete( docNum );
			}
		}
		objIndexReader.close();


		Searcher objIndexSearcher = new IndexSearcher("index");

		Query objQuery = null;

		objQuery = QueryParser.parse("Delhi", "address"
              , objAnalyzer);


		Hits objHits = objIndexSearcher.search(objQuery);

		System.out.println(" objHits "+objHits.length());

		for (int nStart = 0; nStart < objHits.length(); nStart++)
		{
			objDocument = objHits.doc(nStart);
			System.out.println(" address "+objDocument.get("address"));
		}
		objIndexSearcher.close();
		objIndexSearcher = null;


	}
	public static void main(String[] args) throws Exception
	{
		new LuceneDelete().test();
	}
}




E.FAISAL




>From: mahaveer jain <jainmahaveer23@yahoo.com>
>Reply-To: "Lucene Users List" <lucene-user@jakarta.apache.org>
>To: Lucene Users List <lucene-user@jakarta.apache.org>,  Paul 
><paul.fuehring@gmail.com>
>Subject: Re: Deleting index for DB indexing
>Date: Thu, 30 Dec 2004 21:17:48 -0800 (PST)
>
>Thanks Paul,
>
>You idea seems to be good. I ll try that. I have one more question. Should 
>the new key what I create have to be keyword ? or Can it be just a column 
>in the index ?
>
>Mahaveer
>
>Paul <paul.fuehring@gmail.com> wrote:
>On Thu, 30 Dec 2004 08:36:04 -0800 (PST), mahaveer jain
>wrote:
> > I am indexing more that 5 tables. And each for them have autoincrement 
>and
> > that is the primary key. So if I do find DocNum, it may so happen that 
>it
> > may delete document I don't want to delete.
>
>you need to create your own global ID, I had the same problem (but I
>used a MD5 hashvalue). One solution ist to give each of your tables an
>internal number and when creating your lucene-documents you add an
>additional field with something like "dbInternalId*100+dbNumber" so
>that db-record 5 in table 3 results in 503. when documents from your
>DB are deleted and you need to update the index you simple create a
>term which's value is calculated the same way and delete the document
>with the IndexReader.delete(Term)
>Instead of calculating you can do string concatenating as well :)
>
>Paul
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
>
>__________________________________________________
>Do You Yahoo!?
>Tired of spam?  Yahoo! Mail has the best spam protection around
>http://mail.yahoo.com

_________________________________________________________________
The MS Office product suite. Make efficiency a habit. 
http://www.microsoft.com/india/office/experience/  Simplify your life.


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message