lucene-java-user mailing list archives

From Brian Caruso <bd...@cornell.edu>
Subject Conflicts with Stemming and Wildcard / Prefix Queries
Date Fri, 23 Jun 2006 19:23:18 GMT
I was having a problem with wildcard and prefix queries not returning hits on
a stemmed field.  QueryParser does not run wildcard and prefix terms through
the analyzer, so the unstemmed query text never matches the stemmed terms in
the index.  To solve this I subclassed QueryParser to hold a HashMap of
stemmed field name -> unstemmed field name and used that map when
constructing WildcardQueries and PrefixQueries.  Now I index a stemmed
version and an unstemmed version of each field, and this QueryParser
switches to the unstemmed one exactly when it should.

I hope this helps someone, here is the code:

import java.util.HashMap;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.queryParser.*;

/**
 * This is a QueryParser that falls back to an unstemmed
 * field for WildcardQueries and PrefixQueries.
 * @author bdc34 a cornell dot edu
 */
public class VitroQueryParser extends QueryParser {
	/** 
	 * Map from stemmed field names to the names of fields with the
	 * same terms but unstemmed.	 
	 */
	HashMap <String,String> stemmedToUnstemmed;
	
	public VitroQueryParser(String f, Analyzer a) {super(f, a);	}
	public VitroQueryParser(CharStream stream) {super(stream);	}
	public VitroQueryParser(QueryParserTokenManager tm) {super(tm);	}
	
	/**
	 * Sets the map from each stemmed field name (the key) to the name
	 * of the field that holds the unstemmed version of the same terms
	 * (the value).
	 */
	public void setStemmedToUnstemmed(HashMap<String, String> stemmedToUnstemmed) {
		this.stemmedToUnstemmed = stemmedToUnstemmed;
	}
	
	/**
	 * Attempts to get the name of the field holding the unstemmed data
	 * for the given stemmedField.  Returns stemmedField unchanged
	 * if there is no mapping in stemmedToUnstemmed.
	 */
	public String getUnstemmed(String stemmedField){
		if( stemmedField == null || 
			stemmedToUnstemmed == null ||
			!stemmedToUnstemmed.containsKey(stemmedField))
			return stemmedField;
		else
			return stemmedToUnstemmed.get(stemmedField);		
	}
	
	@Override
	protected org.apache.lucene.search.Query getPrefixQuery(String field, String termStr) 
	throws ParseException {		
		return super.getPrefixQuery(getUnstemmed(field), termStr);
	}
	
	@Override
	protected org.apache.lucene.search.Query getWildcardQuery(String field, String termStr) 
	throws ParseException {
		return super.getWildcardQuery(getUnstemmed(field), termStr);
	}	
}
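
For anyone wondering why the fallback is needed at all, here is a minimal
sketch that does not use Lucene.  It models an index as a map from field
name to a set of terms.  The field names, the toyStem helper and the class
name are all made up for illustration, and the "stemmer" is far cruder than
a real one.  It only shows the mechanism: a prefix typed by the user misses
the stemmed terms but hits the unstemmed ones once the field name is swapped.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of a two-field index; this is not Lucene code. */
class StemFallbackDemo {
    // Hypothetical field names, chosen for this illustration only.
    static final String STEMMED = "body";
    static final String UNSTEMMED = "body_unstemmed";

    // Same shape as the stemmedToUnstemmed map in VitroQueryParser.
    static final Map<String, String> STEMMED_TO_UNSTEMMED = new HashMap<>();
    static {
        STEMMED_TO_UNSTEMMED.put(STEMMED, UNSTEMMED);
    }

    /** Toy stemmer (an assumption): only drops a trailing "ing". */
    static String toyStem(String term) {
        return term.endsWith("ing") ? term.substring(0, term.length() - 3) : term;
    }

    /** Indexes every word twice: stemmed in one field, as-is in the other. */
    static Map<String, Set<String>> buildIndex(String... words) {
        Map<String, Set<String>> index = new HashMap<>();
        index.put(STEMMED, new TreeSet<>());
        index.put(UNSTEMMED, new TreeSet<>());
        for (String w : words) {
            index.get(STEMMED).add(toyStem(w));
            index.get(UNSTEMMED).add(w);
        }
        return index;
    }

    /** Mirrors getUnstemmed: returns the field itself when no mapping exists. */
    static String resolveField(String field) {
        return STEMMED_TO_UNSTEMMED.getOrDefault(field, field);
    }

    /** A prefix query matches if any term in the field starts with the prefix. */
    static boolean hasPrefix(Map<String, Set<String>> index, String field, String prefix) {
        Set<String> terms = index.get(field);
        if (terms == null) {
            return false;
        }
        for (String t : terms) {
            if (t.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> index = buildIndex("searching");
        // The stemmed field holds "search", so the prefix "searchi" misses.
        System.out.println(hasPrefix(index, STEMMED, "searchi"));
        // The unstemmed field holds "searching", so the same prefix hits.
        System.out.println(hasPrefix(index, resolveField(STEMMED), "searchi"));
    }
}
```

Running main prints false and then true.  The cost of this approach is that
every stemmed field has to be indexed a second time without stemming.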

-- 
Brian Caruso
Programmer/Analyst
Albert R. Mann Library
Cornell University 
Ithaca, NY 14853
