xml-general mailing list archives

From erik...@daimi.au.dk
Subject JAXP problems: Latin-1 encoded files, and DTDs in jar files
Date Mon, 08 Apr 2002 02:10:34 GMT
Hello all,
I've run into problems using the SAX parser in JAXP.

The first is that it seems impossible to get it to parse files encoded in
anything other than UTF-8 - in my case, I want it to read Latin-1 (ISO-8859-1).

What I get is an
org.xml.sax.SAXParseException: Character conversion error: "Malformed UTF-8 char
-- is an XML encoding declaration missing?" (line number may be too low).

I have tried inserting encoding="ISO-8859-1" (or encoding="Latin-1") in the XML
DOCTYPE tag and in the DTD's <?xml?> tag. (Actually, even if I supply nonsense
encoding names, I get the same error message - the attribute seems to be ignored
altogether.)

I've also tried something like
   InputSource is = new InputSource(new FileInputStream(filename));
   is.setEncoding("ISO-8859-1");
   xmlReader.parse(filename);
but without success.
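
One thing worth ruling out in the snippet above: parse(filename) is handed the
file name rather than the InputSource, so the setEncoding call would have no
effect on what the parser reads. The following is only a minimal sketch of the
intended call sequence, assuming a JAXP-supplied XMLReader; the class and method
names (Latin1ParseSketch, parseLatin1) are made up for illustration. As far as
the XML specification goes, the encoding declaration belongs in the XML
declaration on the first line of the document, e.g.
<?xml version="1.0" encoding="ISO-8859-1"?>, not in the DOCTYPE.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of JAXP.
public class Latin1ParseSketch {

    /** Parses the byte stream as ISO-8859-1 and returns the collected character data. */
    public static String parseLatin1(InputStream in) throws Exception {
        XMLReader xmlReader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        final StringBuilder text = new StringBuilder();
        xmlReader.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // SAX may deliver text in several chunks, so accumulate them.
                text.append(ch, start, length);
            }
        });

        InputSource is = new InputSource(in);
        is.setEncoding("ISO-8859-1");
        // Pass the InputSource itself; passing the file name would bypass it.
        xmlReader.parse(is);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Sample document built in memory; a FileInputStream works the same way.
        String doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>S\u00f8rensen</name>";
        byte[] bytes = doc.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(parseLatin1(new ByteArrayInputStream(bytes)));
    }
}
```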

As my editor does not support UTF-8, I'd really like to be able to write Latin-1
documents. Is there a solution?


The second problem is that our Java program is going to be put into a jar file,
along with its data: some XML files and their DTD.
I know how to make the XML parser read the XML files, but it cannot find the
DTD (and I can't just specify a URL in the DOCTYPE tag of the XML files, as
URLs for jar resources are platform dependent). How do you make the parser search
for the DTD among the resources in the jar file?
(Our program has no problems with file locations before being jarred, so I'd of
course like to know a minimal-impact solution, if one exists.)
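
One approach that may fit here is a SAX EntityResolver that intercepts the
request for the DTD and serves it from the classpath, so the DOCTYPE in the XML
files can stay unchanged. This is only a sketch under assumptions: the class name
ClasspathDtdResolver and the resource directory passed to it (for example
"/dtd/") are hypothetical, and it assumes the DTD is stored in the jar under
that directory with the same file name as in the system identifier.

```java
import java.io.InputStream;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical resolver: maps any "*.dtd" system ID to a classpath resource.
public class ClasspathDtdResolver implements EntityResolver {

    // Directory inside the jar that holds the DTDs, e.g. "/dtd/" (an assumption).
    private final String resourceDir;

    public ClasspathDtdResolver(String resourceDir) {
        this.resourceDir = resourceDir;
    }

    /** Returns the part of the system ID after the last '/'. */
    public static String fileNameOf(String systemId) {
        int slash = systemId.lastIndexOf('/');
        return slash >= 0 ? systemId.substring(slash + 1) : systemId;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        if (systemId == null || !systemId.endsWith(".dtd")) {
            return null; // not a DTD: let the parser resolve it as usual
        }
        InputStream in = ClasspathDtdResolver.class
                .getResourceAsStream(resourceDir + fileNameOf(systemId));
        if (in == null) {
            return null; // not bundled: fall back to the parser's default lookup
        }
        InputSource source = new InputSource(in);
        source.setSystemId(systemId); // keep the original ID for error messages
        return source;
    }
}
```

It would be registered before parsing with
xmlReader.setEntityResolver(new ClasspathDtdResolver("/dtd/"));
Returning null for anything it does not recognise keeps the existing,
un-jarred file layout working as before.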

    Erik Søe Sørensen
--
Lights. Panic. Action.

---------------------------------------------------------------------
In case of troubles, e-mail:     webmaster@xml.apache.org
To unsubscribe, e-mail:          general-unsubscribe@xml.apache.org
For additional commands, e-mail: general-help@xml.apache.org

