lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From jeff.rich...@sevajug.org
Subject Re: Query question
Date Sun, 05 Nov 2006 03:12:06 GMT
Thought I attached the code :)

package com.infinity.naxx.sandbox;

import java.io.IOException;
import java.util.Iterator;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.KeywordAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hit;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PhraseQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.store.RAMDirectory;

public class LuceneTest {
	public static void main(String[] args) {
		try {
			Analyzer analyzer = new KeywordAnalyzer();
			RAMDirectory directory = new RAMDirectory();
			IndexWriter writer = new IndexWriter(directory, analyzer, true);

			Document document = new Document();
			Field location = new Field("location", "/a/b/c", Field.Store.YES,
					Field.Index.UN_TOKENIZED);
			document.add(location);
			Field name = new Field("name", "Jeff Richley", Field.Store.YES,
					Field.Index.UN_TOKENIZED);
			document.add(name);

			writer.addDocument(document);
			writer.optimize();
			writer.close();

			IndexSearcher searcher = new IndexSearcher(directory);
			QueryParser parser = new QueryParser("name", analyzer);
			Query query = parser.parse("\"Jeff Richley\"");

			System.out.println("Searching: " + query.toString());

			Hits hits = searcher.search(query);
			System.out.println("There were " + hits.length() + " hits");

			for (Iterator iter = hits.iterator(); iter.hasNext();) {
				Hit hit = (Hit) iter.next();
				System.out.println(hit.getScore() + " "
						+ hit.getDocument().get("location"));
			}
		} catch (IOException e) {
			e.printStackTrace();
		} catch (ParseException e) {
			e.printStackTrace();
		}
	}
}




> I know I am getting very close on this one but can't seem to get the score
> above .306.  My guess is that I need to do something different in my
> query.  If at all possible, could you take a quick look at my test code
> and point me in the correct direction?  I know everyone is very busy, so
> any help would be greatly appreciated.
>
>>
>> : 1.) I have data like name="Jeff" lastname="Richley" age="33" and I
>> need
>> to
>> : be able to query by any combination such as name="Jeff" age="33".  But
>> if
>> : I query with name="Jeffrey" there is no match.
>> :
>> : 2.) The name value pairs are not really controlled until the end user
>> is
>> : inserting information or querying.  I may have the data from the
>> previous
>> : example and then have another that has address information and then
>> : something totally unrelated such as stock prices.  The point is, I
>> can't
>> : guarantee what exactly will be in the data.
>>
>> Lucene will fit your needs because of your second point nicely,
>> Documents
>> don't need to all have hte same fields.
>>
>> as to your first point, and your question about 100% matches, Lucene
>> should be able to meet your needs perfectly, you just have to understand
>> how to ask it the right question.  lemme give you a quick check list of
>> things to keep in mind, and as you dig into the documentation these will
>> make more sense...
>>
>>  1) Use only UN_TOKENIZED fields when adding your documents, and if you
>> use QueryParser to build your queries for you, use the KeywordAnalyzer
>> to
>> make sure no lowercasing or stemming takes place.
>>  2) OMIT_NORMs when indexing .. they only matter if you want the lengths
>> of fields to affect the score, and you don't -- you only want to know if
>> it matched or not.
>>  3) if you want to require name="jeff" and age="33" make sure you
>> construct a query where all clauses are mandatory .. the default in the
>> query parser is "SHOULD" meaning only one clause is mandatory, and the
>> other clauses increase the score.
>>
>>
>>
>> -Hoss
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
>
> Jeff Richley, Vice President
> Southeast Virginia Java Users Group
> jeff.richley@sevajug.org
> http://www.sevajug.org
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org


Jeff Richley, Vice President
Southeast Virginia Java Users Group
jeff.richley@sevajug.org
http://www.sevajug.org


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message