lucene-java-user mailing list archives

From Erik Hatcher
Subject Re: HTML tagged terms boosting...
Date Wed, 21 Jan 2004 12:27:02 GMT
It definitely cannot be done with custom token types.  You're probably 
aiming for field-specific boosting, so you will need to parse the HTML 
into separate fields and use a multi-field search approach.
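To make the separate-fields idea concrete, here is a minimal sketch in plain Java. It deliberately does not use the Lucene API: the field names ("heading", "bold", "body"), the weights, and the regex-based extraction are all assumptions made for illustration, and a real HTML parser would be more robust than regular expressions. It only shows the shape of the approach: split the document into fields, then weight a match by the field it occurs in.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HtmlFieldSplitter {

    // Hypothetical field names and weights, chosen only for illustration.
    static final Map<String, Double> FIELD_BOOSTS = new LinkedHashMap<>();
    static {
        FIELD_BOOSTS.put("heading", 3.0);
        FIELD_BOOSTS.put("bold", 2.0);
        FIELD_BOOSTS.put("body", 1.0);
    }

    private static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;
    private static final Pattern H1 = Pattern.compile("<h1\\b[^>]*>(.*?)</h1>", FLAGS);
    private static final Pattern BOLD = Pattern.compile("<b\\b[^>]*>(.*?)</b>", FLAGS);
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    // Replace every tag with a space and collapse runs of whitespace.
    public static String stripTags(String s) {
        return TAG.matcher(s).replaceAll(" ").trim().replaceAll("\\s+", " ");
    }

    // Concatenate the text of every element matched by the pattern.
    private static String collect(Pattern p, String html) {
        StringBuilder sb = new StringBuilder();
        Matcher m = p.matcher(html);
        while (m.find()) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(stripTags(m.group(1)));
        }
        return sb.toString();
    }

    // Split one HTML document into the text of each (hypothetical) field.
    public static Map<String, String> splitFields(String html) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("heading", collect(H1, html));
        fields.put("bold", collect(BOLD, html));
        fields.put("body", stripTags(html));
        return fields;
    }

    // Toy multi-field score: each occurrence of the term adds its field's weight.
    public static double score(Map<String, String> fields, String term) {
        String needle = term.toLowerCase();
        double total = 0.0;
        for (Map.Entry<String, Double> e : FIELD_BOOSTS.entrySet()) {
            String text = fields.getOrDefault(e.getKey(), "").toLowerCase();
            for (String word : text.split("\\s+")) {
                if (word.equals(needle)) total += e.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, String> fields =
            splitFields("<h1>Lucene</h1> plain text about <b>boosting</b> terms");
        System.out.println(fields);
        System.out.println(score(fields, "boosting"));
        System.out.println(score(fields, "plain"));
    }
}
```

With Lucene itself, the equivalent would be to add each extracted string to the Document as its own field and then search across those fields, for example with a query-string boost along the lines of heading:term^3 bold:term^2 body:term.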

I'm sure there are other tricks that could be used for boosting, like 
inserting the words inside <b> multiple times into the same field for 


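The repetition trick mentioned above (putting the words inside <b> into the same field more than once) could look roughly like the sketch below. This is plain Java preprocessing rather than anything from the Lucene API; the class name, the repeat count, and the regex handling of <b> are assumptions for illustration. The resulting string would then be indexed as a single field, so the repeated words get a higher term frequency.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BoldTermRepeater {

    private static final Pattern BOLD =
        Pattern.compile("<b\\b[^>]*>(.*?)</b>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    // Return plain text in which the content of every <b> element
    // appears `repeat` times instead of once (repeat is a hypothetical knob).
    public static String expand(String html, int repeat) {
        StringBuilder out = new StringBuilder();
        Matcher m = BOLD.matcher(html);
        int last = 0;
        while (m.find()) {
            out.append(html, last, m.start());      // text before the <b> element
            for (int i = 0; i < repeat; i++) {
                out.append(' ').append(m.group(1)); // repeated bold text
            }
            out.append(' ');
            last = m.end();
        }
        out.append(html.substring(last));           // text after the last match
        // Drop any remaining tags and collapse whitespace.
        return TAG.matcher(out).replaceAll(" ").trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(expand("plain <b>boosting</b> terms", 3));
    }
}
```

Note that repeating words also changes token positions and field length, so phrase queries and length normalization may behave differently than with the original text.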
On Jan 21, 2004, at 6:50 AM, Alexey Maksakov wrote:

> Hello!
> Is there any way to boost terms in HTML documents that are surrounded
> by HTML tags such as <B>, <H1>, etc.?
> Can it be done with the existing API, or is a reimplementation of
> TokenStream with custom Token types needed?
> It seems to me, though, that even such a reimplementation won't help
> without changing the indexing and searcher code... I hope that I'm wrong.
> Thanks in advance.
> Alexey.
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
