cocoon-users mailing list archives

From Ugo Cei
Subject Re: Removing non-xhtml tags from a string ???
Date Thu, 29 Apr 2004 08:36:15 GMT
On 28 Apr 2004, at 17:20, Marcin Okraszewski wrote:

> Hi,
> I'm building a webapp that lets users enter XHTML via forms. The 
> problem is that the entered XHTML *must* be valid XML. I use JTidy 
> to correct any errors that may occur. So far so good.
> But it turns out that if JTidy gets a tag that it doesn't know, it 
> simply returns an empty string :-( It is a bit confusing, since I use 
> HTMLArea, and text pasted from Microsoft Office contains 
> <o:p/> tags, which cause the empty string!
> Is there any way to throw such tags away?

I don't know if it would really help, but you might try using CyberNeko 
[1] instead of JTidy. I've found it gives better results on average, 
particularly when dealing with [so-called] HTML pasted from Word.
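
Independently of which parser is used, one stopgap would be to strip the
namespace-prefixed Office tags (such as <o:p/>) from the string before it
reaches the tidying step. The following is only a minimal sketch under that
assumption: it uses nothing but java.util.regex, the class and method names
(OfficeTagStripper, stripPrefixedTags) are hypothetical, and a regex is not a
real HTML parser, so attribute values containing ">" would trip it up.

```java
import java.util.regex.Pattern;

public class OfficeTagStripper {
    // Matches opening, closing, and self-closing tags whose name carries a
    // namespace prefix, e.g. <o:p>, </o:p>, <o:p/>, <w:foo attr="x">.
    // Assumption: attribute values never contain a literal '>'.
    private static final Pattern PREFIXED_TAG = Pattern.compile(
        "</?[A-Za-z][A-Za-z0-9]*:[A-Za-z][A-Za-z0-9]*(\\s[^>]*)?/?>");

    // Removes the prefixed tags themselves; any text between an opening
    // and a closing prefixed tag is kept.
    public static String stripPrefixedTags(String html) {
        return PREFIXED_TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String pasted = "<p>Hello<o:p/> world<o:p></o:p></p>";
        // prints "<p>Hello world</p>"
        System.out.println(stripPrefixedTags(pasted));
    }
}
```

The cleaned string could then be handed to JTidy (or CyberNeko) as before;
ordinary tags without a prefix, such as <p> or <br/>, are left untouched.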


