lucene-java-user mailing list archives

From Vikas Gupta
Subject Customizing termFreq
Date Sun, 12 Dec 2004 08:17:49 GMT
Hi developers,

I am indexing HTML documents in Lucene with the following fields:

H1:"text in H1 font"
H2:"text in H2 font"
H6:"text in H6 font"
content:"all the text"

The problem is that a query for a term xyz in the H1 field is scored with
the termFreq of xyz in the H1 field, whereas I want it to be scored using
the termFreq of xyz in the entire document (i.e. the content field).
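
To make the desired scoring concrete, here is a minimal sketch in plain Java. It is not Lucene code: the class ContentTfScoring and its methods are hypothetical names made up for illustration, the tokenization is a naive whitespace split, and the square root is only an assumption meant to resemble a typical tf() damping. The matched field (e.g. H1) decides whether the document matches and which boost applies, while the frequency is taken from the content field.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; none of these names are Lucene APIs.
final class ContentTfScoring {

    // Count occurrences of `term` in a field value, using a naive
    // lowercase + whitespace tokenization (an assumption for this sketch).
    static int termFreq(String fieldText, String term) {
        if (fieldText == null || fieldText.isEmpty()) {
            return 0;
        }
        String needle = term.toLowerCase(Locale.ROOT);
        int count = 0;
        for (String token : fieldText.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.equals(needle)) {
                count++;
            }
        }
        return count;
    }

    // The matched field decides IF the document matches and WHICH boost
    // applies; the frequency itself is read from the "content" field.
    static double score(Map<String, String> doc, String matchField,
                        String term, double fieldBoost) {
        if (termFreq(doc.get(matchField), term) == 0) {
            return 0.0; // term does not occur in the queried field
        }
        int freq = termFreq(doc.get("content"), term);
        // Square-root damping is an assumption, not a confirmed Lucene formula here.
        return fieldBoost * Math.sqrt(freq);
    }
}
```

For example, if H1 is "xyz overview" and the content field contains xyz three times, score(doc, "H1", "xyz", 2.0) is computed from a frequency of 3 rather than 1.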

Can you point me to how to achieve this?

I took a look at the Similarity class. It does have a tf() function, but it
is passed an already-computed termFreq value, so it cannot choose which
field the frequency comes from.

Thanks a lot.

PS: I am using Lucene for a class project in which I am trying to make use
of the font information in HTML documents. I am boosting the scores for
matches in the H6 field over matches in H5, and so on.
