lucene-java-user mailing list archives

From Stefan Groschupf ...@media-style.com>
Subject Re: HTML tag filter...
Date Sat, 10 Jan 2004 19:00:30 GMT
If you browse the CVS repository of nutch.org, you will find an implementation.
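
For a quick experiment before looking at the Nutch code, the basic idea can be sketched in plain Java. This is only a naive illustration and not the Nutch implementation: the class name HtmlTextExtractor and the method stripTags are made up for this example, and the sketch deliberately ignores comments, script and style content, and character entities such as &amp;amp;.

```java
// Naive sketch: drops everything between '<' and '>' and keeps the rest.
// HtmlTextExtractor and stripTags are hypothetical names, not part of
// Lucene or Nutch. Real-world HTML needs a proper parser.
final class HtmlTextExtractor {

    private HtmlTextExtractor() {}

    /** Returns the input with all tag-like spans removed and whitespace collapsed. */
    static String stripTags(String html) {
        StringBuilder text = new StringBuilder(html.length());
        boolean insideTag = false;
        for (int i = 0; i < html.length(); i++) {
            char c = html.charAt(i);
            if (c == '<') {
                insideTag = true;           // start skipping at the opening bracket
            } else if (c == '>' && insideTag) {
                insideTag = false;          // tag finished, resume copying
                text.append(' ');           // keep words on either side of a tag apart
            } else if (!insideTag) {
                text.append(c);             // ordinary text outside any tag
            }
        }
        // Collapse runs of whitespace left behind by removed tags.
        return text.toString().replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String page = "<html><body><h1>Title</h1><p>Some <b>bold</b> text.</p></body></html>";
        System.out.println(stripTags(page)); // prints "Title Some bold text."
    }
}
```

The resulting plain text could then be handed to whatever analyzer is already used for indexing, so the tag handling stays separate from tokenization.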

HTH
Stefan


On 10.01.2004 at 19:43, ambiesense@gmx.de wrote:

> Hi group,
>
> would it be possible to implement an Analyzer that filters the HTML code
> out of an HTML page? As a result I would have only the text, free of any
> tagging.
>
> Or is it better to use other existing open source software for that? Has
> anybody here tried that?
>
> Cheers,
> Ralf
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org