commons-dev mailing list archives

From Gary Gregory
Subject [IO] BOMInputStream bug?
Date Fri, 10 Aug 2012 17:44:27 GMT
Hi All:

Does anyone have expertise with BOMInputStream?

I know that some XML parsers (like the one shipped with the Oracle JRE) do
not detect UTF-32 BOMs (UTF-8 and UTF-16 BOMs are OK), but wrapping the
input in a BOMInputStream is supposed to fix the issue.
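
For reference, here is a minimal sketch of what BOM detection involves, using only the JDK. It is not the BOMInputStream implementation from Commons IO; the class name BomSniffer and the method detect are hypothetical names made up for this illustration. The point it shows is that the UTF-32LE BOM (FF FE 00 00) begins with the UTF-16LE BOM (FF FE), so a detector has to try the four-byte BOMs before the two-byte ones.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Commons IO.
final class BomSniffer {

    // Insertion order matters: the 4-byte UTF-32 BOMs are listed first because
    // the UTF-32LE BOM (FF FE 00 00) starts with the UTF-16LE BOM (FF FE).
    private static final Map<String, byte[]> BOMS = new LinkedHashMap<>();
    static {
        BOMS.put("UTF-32BE", new byte[] {0x00, 0x00, (byte) 0xFE, (byte) 0xFF});
        BOMS.put("UTF-32LE", new byte[] {(byte) 0xFF, (byte) 0xFE, 0x00, 0x00});
        BOMS.put("UTF-8", new byte[] {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF});
        BOMS.put("UTF-16BE", new byte[] {(byte) 0xFE, (byte) 0xFF});
        BOMS.put("UTF-16LE", new byte[] {(byte) 0xFF, (byte) 0xFE});
    }

    /** Returns the charset name implied by a leading BOM, or null if none matches. */
    static String detect(byte[] head) {
        for (Map.Entry<String, byte[]> entry : BOMS.entrySet()) {
            byte[] bom = entry.getValue();
            if (head.length >= bom.length
                    && Arrays.equals(Arrays.copyOf(head, bom.length), bom)) {
                return entry.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // UTF-32LE BOM followed by '<' encoded as a 4-byte little-endian code unit.
        byte[] utf32le = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x00, '<', 0x00, 0x00, 0x00};
        System.out.println(detect(utf32le)); // prints UTF-32LE
    }
}
```

If the two-byte BOMs were checked first, the same bytes would be reported as UTF-16LE, which is one way a UTF-32 document can be misread.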

These tests I added and @Ignore'd fail:


More basic tests do work:


When I look at the Oracle JRE (which uses a copy of Xerces), I see code to
deal with UCS-4, which is a precursor of UTF-32, just as UCS-2 is a precursor
of UTF-16, but as the test shows, Xerces fails to parse a UTF-32 document.

Any thoughts?
Thank you,
