cocoon-users mailing list archives

From Ugo Cei <...@apache.org>
Subject Re: Removing non-xhtml tags from a string ???
Date Thu, 29 Apr 2004 08:36:15 GMT
On 28 Apr 2004, at 17:20, Marcin Okraszewski wrote:

> Hi,
> I'm building a webapp that lets users enter XHTML via forms. The 
> problem is that the entered XHTML *must* be valid XML. I use JTidy 
> to correct any errors that may occur. So far so good.
>
> But it turns out that if JTidy gets a tag that it doesn't know, it 
> simply returns an empty string :-( It is a bit confusing, since I use 
> HTMLArea, and text pasted from Microsoft Office contains 
> <o:p/> tags, which cause the empty string!
>
> Is there any way to throw such tags away?

I don't know if it would really help, but you might try using CyberNeko 
[1] instead of JTidy. I've found it gives better results on average, 
particularly when dealing with [so-called] HTML pasted from Word.

	Ugo


[1] http://www.apache.org/~andyc/neko/doc/html/
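
If switching parsers is not an option, one way to throw such tags away is to strip namespace-prefixed tags like <o:p/> from the string before it reaches JTidy. The following is only a minimal sketch under stated assumptions: the class and method names (PrefixedTagStripper, strip) are hypothetical, it assumes every tag with a prefix (o:, st1:, w: and so on) can be dropped while its text content is kept, and it relies on a regular expression that will misfire on attribute values containing ">". It uses only java.util.regex and does not call JTidy or CyberNeko.

```java
import java.util.regex.Pattern;

// Hypothetical helper: removes namespace-prefixed tags (e.g. <o:p/>)
// from a markup string while leaving their text content in place.
public class PrefixedTagStripper {

    // Matches opening, closing, and self-closing tags whose name carries
    // a prefix, such as <o:p>, </o:p>, <o:p/> or <st1:place w:st="on">.
    // Assumption: attribute values never contain a literal '>'.
    private static final Pattern PREFIXED_TAG = Pattern.compile(
        "</?[A-Za-z][A-Za-z0-9]*:[A-Za-z][A-Za-z0-9_.-]*(\\s[^>]*)?/?>");

    public static String strip(String html) {
        // Delete every matching tag; surrounding text is untouched.
        return PREFIXED_TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String pasted = "<p>Hello<o:p/> <o:p>world</o:p></p>";
        // Prints: <p>Hello world</p>
        System.out.println(strip(pasted));
    }
}
```

The cleaned string can then be passed to JTidy as before. Unprefixed tags such as <p> or <a href="http://..."> are left alone, because the pattern requires a colon inside the tag name itself.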

