lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: HTML tagged terms boosting...
Date Wed, 21 Jan 2004 12:27:02 GMT
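This cannot be done with custom token types: the token type is only 
visible during analysis and is not written to the index, so it never 
reaches scoring.  You're probably aiming for field-specific boosting, so 
you will need to parse the HTML into separate fields and use a 
multi-field search approach.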
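
Here is a rough sketch of the separate-fields idea in plain Java, with 
no Lucene dependency.  The class name, the field names ("heading", 
"bold", "body") and the boost values 4 and 2 are all assumptions made up 
for illustration, and the regex-based tag handling is deliberately crude 
(a real HTML parser would be more robust).  Only the field:term^boost 
form of the generated query string is standard query parser syntax.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: splits HTML into per-tag "fields"
// and builds a query string that boosts each field differently.
class HtmlFieldSplitter {
    private static final Pattern ANY_TAG = Pattern.compile("<[^>]+>");

    // Removes all tags and collapses runs of whitespace.
    static String stripTags(String html) {
        return ANY_TAG.matcher(html).replaceAll(" ").replaceAll("\\s+", " ").trim();
    }

    // Collects the text inside every <tag>...</tag> pair (case-insensitive).
    // Nested occurrences of the same tag are not handled.
    static String textInside(String html, String tag) {
        Pattern p = Pattern.compile("<" + tag + "\\b[^>]*>(.*?)</" + tag + ">",
                Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        Matcher m = p.matcher(html);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(stripTags(m.group(1)));
        }
        return sb.toString();
    }

    // One entry per field to index; the field names are assumptions.
    static Map<String, String> toFields(String html) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("heading", textInside(html, "h1"));
        fields.put("bold", textInside(html, "b"));
        fields.put("body", stripTags(html));
        return fields;
    }

    // Searches all three fields for a single term; boost values are arbitrary.
    static String boostedQuery(String term) {
        return "heading:" + term + "^4 bold:" + term + "^2 body:" + term;
    }
}
```

Each map entry would become one indexed field of the document, and the 
string from boostedQuery would be handed to the query parser, so a hit 
in the heading or bold field counts for more than a hit in the body.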

I'm sure there are other tricks that could be used for boosting, for 
example inserting the words inside <b> multiple times into the same 
field to raise their term frequency.
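
A minimal sketch of that repetition trick, again in plain Java and 
without Lucene.  The class name and the copies parameter are 
hypothetical, and the regex handling of <b> is an assumption that 
ignores nested tags.  The output is the text you would index into the 
single field.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: repeats the text inside <b>...</b>
// so those words occur more often in the text that gets indexed.
class BoldRepeater {
    private static final Pattern BOLD = Pattern.compile(
            "<b\\b[^>]*>(.*?)</b>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns tag-free text in which each bold run appears `copies` times.
    static String expand(String html, int copies) {
        Matcher m = BOLD.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            StringBuilder repeated = new StringBuilder();
            for (int i = 0; i < copies; i++) {
                if (i > 0) repeated.append(' ');
                repeated.append(m.group(1));
            }
            // quoteReplacement guards against '$' and '\' in the bold text.
            m.appendReplacement(out, Matcher.quoteReplacement(repeated.toString()));
        }
        m.appendTail(out);
        // Drop any remaining tags and collapse whitespace.
        return out.toString().replaceAll("<[^>]+>", " ").replaceAll("\\s+", " ").trim();
    }
}
```

Note that repeating words in place also shifts term positions, so phrase 
queries that span the bold text may behave differently.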

	Erik


On Jan 21, 2004, at 6:50 AM, Alexey Maksakov wrote:

> Hello!
>
> Is there a way to boost terms in HTML documents that are surrounded
> by HTML tags such as <B>, <H1>, etc.?
>
> Can it be done with the existing API, or is it necessary to reimplement
> TokenStream (or write a new implementation) with custom Token types?
>
> It seems to me, though, that even such a reimplementation won't help
> without changing the indexing and searcher code... I hope I'm wrong.
>
> Thanks in advance.
>
> Alexey.
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org



