lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ahmet Arslan <iori...@yahoo.com>
Subject Re: What is the best Analyzer and Parser for this type of question?
Date Mon, 15 Nov 2010 23:32:22 GMT

> Example of Question:
> - What is the role of PrnP in mad cow disease?

First thing is do not directly query questions. Manually formulate queries:
remove 'what' 'is' 'the' 'of' '?' etc.

For example i would convert this question into:

"mad cow"^5 "cow disease"^3 "mad cow disease"^15 "role PrnP"~5^2 "role mad cow disease"~45
mad^0.1 role^0.5 cow disease PrnP^10

> I am running in 11.638 documents and the result is 10410
> docs for this question (lowwwwww precision)

Use OR default operator, collect and evaluate top 1000 documents only.

And instead of Porter you can try KStem.
http://ciir.cs.umass.edu/cgi-bin/downloads/downloads.cgi

Try different length normalization described here. Also their Lucene query example (SpanNear)
can inspire you.  http://trec.nist.gov/pubs/trec16/papers/ibm-haifa.mq.final.pdf


      

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message