ibatis-user-java mailing list archives

From Philippe Laflamme <philippe.lafla...@mail.mcgill.ca>
Subject Re: Help needed...
Date Thu, 21 Apr 2005 19:02:44 GMT
Daniel,

To represent every character from every alphabet, Unicode needs more 
than one byte per character (a code point takes up to 21 bits, but the 
simplest fixed-width form, UTF-32, just uses 4 bytes, 32 bits).

Since Unicode is cool you're probably interested in using it... On the 
other hand, you're probably not interested in encoding each of your 
characters using 4 bytes (quite an overkill for English for example). 
That's where UTF-8 (or UTF-16) comes in.

UTF-8 is a way of encoding the whole Unicode character set using 
sequences of 8-bit units (1 byte each). In other words, it uses from 
1 to 4 single-byte units per character instead of one fixed 32-bit 
value. UTF-16 is similar, but uses sequences of one or two 16-bit 
units.
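
To make the variable width concrete, here is a small sketch using only the standard JDK. The class and method names (Utf8Lengths, utf8Length) are made up for this illustration; the sample characters are written as Unicode escapes so the result does not depend on how the source file itself is encoded.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Lengths {
    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));            // U+0041, plain ASCII: 1
        System.out.println(utf8Length("\u00E9"));       // U+00E9, e with acute: 2
        System.out.println(utf8Length("\u20AC"));       // U+20AC, euro sign: 3
        System.out.println(utf8Length("\uD83D\uDE00")); // U+1F600, outside the BMP: 4
    }
}
```

The last string is a single character as far as Unicode is concerned, but Java stores it as two 16-bit units, which is the UTF-16 counterpart of the same idea.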

The advantage is that plain ASCII characters (unaccented Latin letters, 
digits, punctuation) are still represented using one byte, just like in 
the old days... Of course, because the high bit is reserved for 
multi-byte sequences, the one-byte range is not as large as 
ISO-8859-1's (for example), so certain Latin characters we are used to, 
accented letters included, require a specially crafted sequence of 2 
(or more) bytes.

For example, the letter é (0xE9 in ISO-8859-1) is encoded as the two 
bytes 0xC3 0xA9 in UTF-8; read back as ISO-8859-1, those two bytes show 
up as Ã©.
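
The byte values above can be checked with a few lines of Java. This is only a sketch: EAcuteBytes and its helper methods are hypothetical names, and the charsets come from java.nio.charset.StandardCharsets.

```java
import java.nio.charset.StandardCharsets;

final class EAcuteBytes {
    // Formats bytes as space-separated hex, e.g. "0xC3 0xA9".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    static String latin1Hex(String s) {
        return hex(s.getBytes(StandardCharsets.ISO_8859_1));
    }

    static String utf8Hex(String s) {
        return hex(s.getBytes(StandardCharsets.UTF_8));
    }

    // Encodes as UTF-8, then (wrongly) decodes the bytes as ISO-8859-1.
    static String utf8ReadAsLatin1(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String eAcute = "\u00E9"; // the letter e with acute accent
        System.out.println(latin1Hex(eAcute)); // 0xE9
        System.out.println(utf8Hex(eAcute));   // 0xC3 0xA9
        // Two characters come back instead of one: U+00C3 followed by U+00A9.
        System.out.println(utf8ReadAsLatin1(eAcute).length()); // 2
    }
}
```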

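The XML standard specifies that if the encoding of a file is not 
declared (and no byte order mark is present), the parser assumes UTF-8. 
So, if I were to save é as the single ISO-8859-1 byte 0xE9 in a file 
that gets parsed as UTF-8, it wouldn't be read as é: 0xE9 looks like 
the first byte of a 3-byte sequence, and the bytes that follow make 
that sequence invalid...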
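
Here is a minimal sketch of that failure outside of any XML parser, using the JDK's strict UTF-8 decoder. StrictUtf8Check and its methods are illustrative names, not anything from iBATIS. The byte 0xE9 has the bit pattern 1110xxxx, which in UTF-8 announces a 3-byte sequence, so a decoder expects two continuation bytes after it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class StrictUtf8Check {
    // True only if the bytes decode as well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // True if the byte has the form 1110xxxx, the lead byte of a 3-byte sequence.
    static boolean isThreeByteLead(byte b) {
        return (b & 0xF0) == 0xE0;
    }

    public static void main(String[] args) {
        String word = "caf\u00E9!"; // "cafe" with an accented e, then '!'
        byte[] latin1 = word.getBytes(StandardCharsets.ISO_8859_1); // ends in 0xE9 0x21
        byte[] utf8 = word.getBytes(StandardCharsets.UTF_8);        // ends in 0xC3 0xA9 0x21

        System.out.println(isThreeByteLead((byte) 0xE9)); // true
        System.out.println(isValidUtf8(latin1));          // false
        System.out.println(isValidUtf8(utf8));            // true
    }
}
```

The exact wording of the error depends on the decoder; an XML parser reports it in its own terms, but the underlying cause is the same kind of malformed sequence.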

It's generally good practice to declare the encoding of an XML file 
explicitly, and to make sure it matches the encoding the file is 
actually saved in.
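
As a rough demonstration of why the declaration matters, the sketch below feeds the same ISO-8859-1 bytes to the JDK's built-in DOM parser, once with an encoding declaration and once without. XmlEncodingDemo and parseLatin1Xml are hypothetical names for this example, and the sample documents are invented; iBATIS itself is not involved.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class XmlEncodingDemo {
    // Serializes the XML text as ISO-8859-1 bytes, parses those bytes, and
    // returns the root element's text, or empty if the parser rejects the input.
    static Optional<String> parseLatin1Xml(String xml) {
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return Optional.of(doc.getDocumentElement().getTextContent());
        } catch (Exception e) {
            // Malformed byte sequences surface here as a parser or I/O exception.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String declared =
                "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>Jos\u00E9</name>";
        String undeclared = "<name>Jos\u00E9</name>";

        // Declared encoding matches the bytes, so the text is recovered.
        System.out.println(parseLatin1Xml(declared).isPresent());
        // No declaration: the parser assumes UTF-8 and trips over the 0xE9 byte.
        System.out.println(parseLatin1Xml(undeclared).isPresent());
    }
}
```

Changing the declaration to iso-8859-1, as Daniel did, works when the file really is saved in that encoding; saving the file as UTF-8 and keeping the utf-8 declaration is the other way to make the two agree.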

Hope this helps,
Philippe

Daniel H. F. e Silva wrote:
> Hi Brice,
> 
>  I know that UTF-8 is supposed to cover all kind of alphabet characters available in Earth. But, i
> had faced this issue sometimes ago, when a parser tried to parse a file with encoding=utf-8 set
> and simply gave me an error message. After thinking about that issue, i simply changed
> encoding=utf-8 to encoding=iso-8859-1 and that fixed my problem.
>  So, if UTF-8 is really supposed to cover all kind of stuff, why did i get that error? I'd love to
> hear your explanation.
> 
> Cheers,
>  Daniel Silva. 
> 
> --- Brice Ruth <bdruth@gmail.com> wrote:
> 
>>What special characters aren't supported by UTF-8?! I have never heard
>>of such a thing. My understanding is that UTF-8 represents the full
>>Unicode character set as a multi-byte value. And since Unicode is
>>supposed to encompass all known characters for all known languages
>>(with space for new Chinese characters created daily) - what's not
>>covered?!
>>
>>There most certainly shouldn't be anything that iso-8859-1 or latin1
>>(Windows-1252) covers that is not in Unicode.
>>
>>Brice
>>
>>On 4/20/05, Daniel H. F. e Silva <dhfs@yahoo.com> wrote:
>>
>>>You could check also your xml encoding. If you work with special charaters not in utf-8, you will
>>>get in trouble.
>>>I had this as my native language is portuguese and we have some special characters not supported
>>>by utf-8.
>>>So, if this is your case, try iso-8859-1 or one that fits better to your needs.
>>>
>>>Cheers,
>>> Daniel Silva.
>>>
>>>
>>>--- Larry Meadors <larry.meadors@gmail.com> wrote:
>>>
>>>>Make sure that there is no white space and no odd chars at the top of your
>>>>config file.
>>>>
>>>>Larry
>>>>
>>>>
>>>>On 4/18/05, KK <kkn006@gmail.com> wrote:
>>>>
>>>>>I get the following error when I try to build sqlCOnfigmap..does it
>>>>>look familiar to someone?
>>>>>
>>>>>com.ibatis.sqlmap.client.SqlMapException: There was an error while
>>>>>building the SqlMap instance.
>>>>>--- The error occurred in the SQL Map Configuration file.
>>>>>--- Cause: com.ibatis.sqlmap.client.SqlMapException: XML Parser Error.
>>>>>Cause: java.io.UTFDataFormatException: Invalid byte 3 of 3-byte UTF-8
>>>>>sequence.
>>>>>Caused by: java.io.UTFDataFormatException: Invalid byte 3 of 3-byte
>>>>>UTF-8 sequence.
>>>>>Caused by: com.ibatis.sqlmap.client.SqlMapException: XML Parser Error.
>>>>>Cause: java.io.UTFDataFormatException: Invalid byte 3 of 3-byte UTF-8
>>>>>sequence.
>>>>>Caused by: java.io.UTFDataFormatException: Invalid byte 3 of 3-byte
>>>>>UTF-8 sequence.
>>>>>at com.ibatis.sqlmap.engine.builder.xml.XmlSqlMapClientBuilder.buildSqlMap(XmlSqlMapClientBuilder.java:203)
>>>>>at com.ibatis.sqlmap.client.SqlMapClientBuilder.buildSqlMapClient(SqlMapClientBuilder.java:49)
>>>>>
>>>>>Your help is greatly appreciated.
>>>>>
>>>>>Thanks,
>>>>>KK
>>>>>
>>>>
>>
>>
>>-- 
>>Brice Ruth
>>Software Engineer, Madison WI
>>
> 
> 

