lucene-dev mailing list archives

From hg...@cswebmail.com
Subject Re: Search Expansion - one step closer ... !
Date Mon, 05 Apr 2004 11:03:19 GMT
Hi Eri<b>k</b> ;-)

Thanks for your quick reply. Basically I am using the
XML indexing example found on the web which first
parses an XML file (I have XML files) and then uses
StandardAnalyzer.
From the XMLDocumentHandlerSax source I can see that it
is using 'text' fields which is fine for me because I
have many small XML files that do not contain too much
text and I would prefer to have all XML tags indexed as
well as stored for hit highlighting purposes.

Using StandardAnalyzer is fine for my domain vocabulary
and I used my 'old' code with QueryParser and
experiments confirmed that indeed searches for "host
defense", "host-defense", "host_defense", "host
Defense" etc... all find "host defense" which is as it
stands in the XML.

So the only thing left for me to do now is to apply
StandardAnalyzer when building the boolean query. I
looked into your demo where you compared different
analysers and created a TokenStream through
StandardAnalyzer.
But there is an error in my query.add() call: Term
expects a String, not a TokenStream. I know it's a pain
with these stupid guys but ... any suitable code
snippet ?

I attach my code below.

When will your book be published ?

Thanks again,

Holger

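import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.search.TermQuery;

public class SearchFiles1D {

  String[] lucene_out;
  TokenStream stream;

  public String[] doSearchBQ(String index_path, String[] myquery) {
    // does query processing without QueryParser but by
    // constructing a boolean query
    try {
      Searcher searcher = new IndexSearcher(index_path);
      Analyzer analyzer = new StandardAnalyzer();

      BooleanQuery query = new BooleanQuery();

      // for each term to add:
      for (int j = 0; j < myquery.length; j++) {
        stream = analyzer.tokenStream("contents",
            new StringReader(myquery[j]));
        // the error is reported here: Term expects a String,
        // not a TokenStream
        query.add(new TermQuery(new Term("subject", stream)),
            false, false);
      }

      Hits hits = searcher.search(query);
      lucene_out = new String[hits.length()];
      for (int i = 0; i < hits.length(); i++) {
        Document doc = hits.doc(i);
        String name = doc.get("filename");
        lucene_out[i] = name + "|" + doc.get("subject")
            + "|" + doc.get("message");
      }
      searcher.close();

    } catch (Exception e) {
      System.out.println(" caught a " + e.getClass()
          + "\n with message: " + e.getMessage());
    }
    return lucene_out;
  }
}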
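
For illustration only, here is a rough sketch of the shape the loop
probably needs: each query string is analyzed into individual tokens,
and one clause is added per token rather than per stream. It is plain
Java with no Lucene dependency, so the class TermExpansionSketch and
its methods analyze() and buildClauses() are hypothetical stand-ins,
and the splitting rule is a crude approximation, not StandardAnalyzer's
actual grammar. In Lucene 1.x itself, as far as I understand the API,
the inner loop would pull tokens with TokenStream.next() until it
returns null and pass Token.termText() to the Term constructor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class TermExpansionSketch {

    // Hypothetical stand-in for an analyzer: lowercases the input and
    // splits on anything that is not an ASCII letter or digit.
    // This is an assumption for illustration, not StandardAnalyzer's rules.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<String>();
        String lowered = text.toLowerCase(Locale.ROOT);
        for (String piece : lowered.split("[^a-z0-9]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Builds one "field:term" clause per analyzed token. In real code each
    // clause would be a TermQuery added to the BooleanQuery.
    static List<String> buildClauses(String field, String[] queries) {
        List<String> clauses = new ArrayList<String>();
        for (String q : queries) {
            for (String token : analyze(q)) {
                clauses.add(field + ":" + token);
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        String[] queries = {"host-defense", "Host Defense"};
        System.out.println(buildClauses("subject", queries));
    }
}
```

The point of the sketch is only that the value handed to each term is
a String taken from a single token, so "host-defense" and "Host
Defense" both end up as the two terms "host" and "defense".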
