lucene-java-user mailing list archives

From Ahmet Arslan
Subject Re: What is the best Analyzer and Parser for this type of question?
Date Mon, 15 Nov 2010 23:32:22 GMT

> Example of Question:
> - What is the role of PrnP in mad cow disease?

First, do not pass the question to the query parser as-is. Formulate the query manually:
remove 'what', 'is', 'the', 'of', '?', etc.

For example, I would convert this question into:

"mad cow"^5 "cow disease"^3 "mad cow disease"^15 "role PrnP"~5^2 "role mad cow disease"~45
mad^0.1 role^0.5 cow disease PrnP^10
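
The manual steps above could be partly automated. Below is a minimal sketch in plain Java (no Lucene dependency) that strips question words and emits clauses in the classic query parser syntax. The class name, the helper names and the stop list are hypothetical and only for illustration ('in' and the articles are my reading of the "etc." above), and the boost and slop values still have to be chosen by hand, as in the example query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class QuestionToQuery {
    // Hypothetical stop list for illustration; tune it for your collection.
    static final Set<String> STOP = new HashSet<>(Arrays.asList(
            "what", "is", "the", "of", "in", "a", "an"));

    /** Splits on non-alphanumeric characters and drops stop words, keeping case. */
    public static List<String> contentTerms(String question) {
        List<String> terms = new ArrayList<>();
        for (String tok : question.split("[^\\p{L}\\p{N}]+")) {
            if (!tok.isEmpty() && !STOP.contains(tok.toLowerCase())) {
                terms.add(tok);
            }
        }
        return terms;
    }

    /** Single-term clause; the boost is appended only when it differs from 1. */
    public static String term(String word, float boost) {
        return boost == 1.0f ? word : word + "^" + boost;
    }

    /** Quoted phrase clause; slop and boost are appended only when non-default. */
    public static String phrase(String text, int slop, float boost) {
        StringBuilder sb = new StringBuilder("\"").append(text).append('"');
        if (slop > 0) sb.append('~').append(slop);
        if (boost != 1.0f) sb.append('^').append(boost);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Content terms left after removing the question words.
        System.out.println(contentTerms("What is the role of PrnP in mad cow disease?"));
        // A few clauses of the hand-written query, assembled with the helpers.
        System.out.println(String.join(" ",
                phrase("mad cow", 0, 5f),
                phrase("role PrnP", 5, 2f),
                term("PrnP", 10f),
                term("mad", 0.1f)));
    }
}
```

Boosts come out in float form (for example ^5.0 rather than ^5), which is still valid query syntax. The resulting string would then be handed to the query parser like any other query.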

> I am running in 11.638 documents and the result is 10410
> docs for this question (lowwwwww precision)

Use OR as the default operator, and collect and evaluate only the top 1000 documents.
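
To make "evaluate the top 1000 only" concrete, here is a minimal sketch of precision at a cutoff, in plain Java. It assumes you already have the ranked document ids from the search and a set of ids judged relevant; the class and method names are hypothetical. When fewer than k documents are returned, this sketch divides by the number actually examined, which is one of several possible conventions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TopKPrecision {
    /** Fraction of the first k ranked ids (or fewer, if the list is shorter) that are relevant. */
    public static double precisionAtK(List<String> ranked, Set<String> relevant, int k) {
        int cutoff = Math.min(k, ranked.size());
        if (cutoff <= 0) {
            return 0.0; // nothing examined
        }
        int hits = 0;
        for (String id : ranked.subList(0, cutoff)) {
            if (relevant.contains(id)) {
                hits++;
            }
        }
        return (double) hits / cutoff;
    }

    public static void main(String[] args) {
        // Toy data: ids are made up for illustration.
        List<String> ranked = Arrays.asList("d1", "d3", "d7", "d2");
        Set<String> relevant = new HashSet<>(Arrays.asList("d1", "d2"));
        System.out.println(precisionAtK(ranked, relevant, 2));
        System.out.println(precisionAtK(ranked, relevant, 1000));
    }
}
```

With a cutoff, the total number of matching documents no longer enters the measurement; only the ranking of the documents you actually examine does.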

Instead of the Porter stemmer, you can also try KStem.

Try the different length normalization described here. Their Lucene query example (SpanNear)
may also give you some ideas.

