poi-dev mailing list archives

From Nick Burch <nick.bu...@alfresco.com>
Subject Re: DO NOT REPLY [Bug 49020] "org.xml.sax.SAXParseException: </b> does not close tag <br>." when opening some Excel 2007 files
Date Wed, 05 May 2010 17:52:12 GMT
On Wed, 31 Mar 2010, Paul Spencer wrote:
>> For the long term, you should report a bug to Microsoft about this. 
>> They either need to sanitise the user input and sort out the tags (eg 
>> <br> becomes <br />), or they need to give up and escape the whole tag
>> contents for the bits where iffy data could get added (eg put this 
>> textbox within a CDATA section)
>
> I will report the bug to Microsoft, but that does not address existing 
> files.

Any luck getting them to agree with the fault?

>> Medium term, we should get a list of the problem bits that Excel does wrong,
>> such as <br> (but perhaps others). Then, we need to write an XML Input Wrapper
>> that cleans these up before they get passed to the XML Processor for loading.
>> Something like this is quite nasty, though it's possible some other project out
>> there has already done it, and we can just re-use what they do.
>
> I like this as a solution.

Having just written code for this workaround, I really don't... It's 
amazingly sick code! Seems to mostly work though, certainly for your test 
file.
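
For illustration only, here is a minimal sketch of the kind of input
wrapper discussed above. It is not the code that went into POI. The class
name BrTagFixer and its methods are hypothetical, it assumes the part is
UTF-8 encoded, and it only handles the unclosed <br> case:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical helper, not part of POI: rewrites unclosed <br> tags so
// that a strict XML parser will accept the content.
final class BrTagFixer {
    // Matches <br> or <br  > in any case, but not the self-closed <br/>.
    private static final Pattern UNCLOSED_BR =
            Pattern.compile("<br\\s*>", Pattern.CASE_INSENSITIVE);

    private BrTagFixer() {}

    // Replace every unclosed <br> with the self-closing form <br/>.
    static String fix(String xml) {
        return UNCLOSED_BR.matcher(xml).replaceAll("<br/>");
    }

    // Read the whole stream (assumed UTF-8), clean it, and hand back a
    // new stream that can be passed on to the XML parser.
    static InputStream wrap(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        String fixed = fix(new String(buf.toByteArray(), StandardCharsets.UTF_8));
        return new ByteArrayInputStream(fixed.getBytes(StandardCharsets.UTF_8));
    }
}
```

A blind text replacement like this would also rewrite a <br> that sits
inside a CDATA section or comment, and it buffers the whole part in
memory, which is part of why this approach is nasty.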

Nick


