lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: How do i prevent the HTML tags being added to Lucene Index..
Date Thu, 20 May 2004 09:39:28 GMT
Also, have a look at the jakarta-lucene-sandbox CVS repository in 
contributions/ant.  It indexes HTML content using JTidy to strip tags.
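For anyone who only needs the general idea (extract the text, then index the text rather than the raw markup), here is a minimal sketch. It is not the sandbox Ant task and it does not use JTidy: it swaps in the HTML parser that ships with the JDK (javax.swing.text.html.parser.ParserDelegator) so that it runs without extra jars. The class name HtmlText and the method strip are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper (name invented for this example): pulls the text
// content out of an HTML string so that only the text gets indexed.
// Uses the JDK's built-in HTML parser, NOT JTidy and NOT the sandbox code.
final class HtmlText {

    private HtmlText() {}

    static String strip(String html) {
        final StringBuilder text = new StringBuilder();

        // The parser reports each run of character data through handleText();
        // tags are reported through other callbacks, which are ignored here.
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                // Separate runs with a space so words on either side of a
                // tag do not fuse together. Side effect: a word that is split
                // by inline markup (wor<b>ld</b>) becomes two tokens.
                text.append(data).append(' ');
            }
        };

        try {
            // true = ignore any charset declared inside the document,
            // since the input is already a Java String.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        // Collapse runs of whitespace left over from the markup layout.
        return text.toString().replaceAll("\\s+", " ").trim();
    }
}
```

Calling HtmlText.strip(pageSource) returns the page's text with the tags gone; that string, rather than the original HTML, is what you would then put into the field you index. For messy real-world pages a cleanup pass such as the JTidy one in the sandbox code is likely to be more forgiving than this sketch.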

	Erik


On May 20, 2004, at 1:42 AM, Mahesh wrote:

> I am using Lucene 1.4 to index the information.
> I have a lot of HTML tags in the information that I will be indexing, so
> let me know if there is any way of keeping the HTML tags from being
> indexed.
>
>
> MAHESH
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org



