jena-users mailing list archives

From Andy Seaborne
Subject Re: Iussue not usual char
Date Mon, 20 Dec 2010 14:55:25 GMT

On 20/12/10 11:31, Alessandro Carrara wrote:
> Hi,
> use Jena for a project in Java.
> I have a problem in the management of some characters.
> When I try to create an attribute in XML encoding UTF-8, I pass a value as a
> String with special characters.
> I find the following result:
> In java:
> resource.addProperty (property, "SÜDBUR");

> In xml:
> <js:property>SÃ? DBUR</js:property>

It looks like you are viewing it as ISO-8859-1.

It all depends on what program you are using to view the XML.

Ü is Unicode U+00DC (\u00DC in Java).
In UTF-8 that encodes as the two bytes C3 9C.

In ISO-8859-1, C3 is Ã and 9C is a non-printing control character (hence "? ").
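To check this yourself, here is a minimal sketch using only the standard Java library. It encodes the string as UTF-8 and then deliberately decodes the same bytes as ISO-8859-1, which is roughly what a viewer with the wrong encoding setting does. The class and method names (MojibakeDemo, hex, misread) are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Formats bytes as space-separated upper-case hex, e.g. "C3 9C".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encodes the text as UTF-8, then (wrongly) decodes those bytes as ISO-8859-1.
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // U-umlaut is written as a Unicode escape so the encoding of this
        // source file does not affect the result.
        String value = "S\u00DCDBUR";

        // The UTF-8 bytes of U+00DC alone.
        System.out.println(hex("\u00DC".getBytes(StandardCharsets.UTF_8))); // C3 9C

        // Code points seen by a program that assumes ISO-8859-1.
        for (char c : misread(value).toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

The misread string has seven characters instead of six: the single Ü has become U+00C3 followed by the control character U+009C.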

Does the XML start with an encoding declaration? If it does not, or if
it declares utf-8, then your XML is probably OK, and it's just the way
you are looking at the file.

<?xml version='1.0' encoding='utf-8'?>

If it starts

<?xml version='1.0' encoding='iso-8859-1'?>
or some such, then the file data is likely corrupt.


Do not use Java's FileWriter - it writes in the platform default
encoding, and that is often not UTF-8.
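One way to avoid the platform default is to wrap an OutputStream in an OutputStreamWriter and name the charset explicitly. The sketch below uses only the standard library; Utf8WriteDemo and writeUtf8 are hypothetical names, and the XML content is a stand-in rather than real Jena output. With Jena itself, handing an OutputStream (rather than a Writer) to Model.write should have the same effect, but I have not shown that here.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class Utf8WriteDemo {
    // Writes the text to the file as UTF-8, whatever the platform default is.
    static void writeUtf8(Path file, String text) throws IOException {
        try (OutputStream out = Files.newOutputStream(file);
             Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            writer.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("demo", ".xml");

        // Stand-in document; U-umlaut is given as a Unicode escape (U+00DC).
        String xml = "<?xml version='1.0' encoding='utf-8'?>\n"
                   + "<property>S\u00DCDBUR</property>\n";
        writeUtf8(file, xml);

        // Reading the file back as UTF-8 should give the original text.
        String readBack = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        System.out.println(readBack.equals(xml));

        Files.delete(file);
    }
}
```

The point of the design is that the encoding is stated in the code, so the file contents match the encoding='utf-8' declaration on every machine.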

> Can you help me?
> thanks
