ibatis-user-java mailing list archives

From "Shannon, Bryan" <BShan...@Tribune.com>
Subject RE: Unicode support for ibatis
Date Mon, 17 Apr 2006 16:45:08 GMT
I have no trouble using Unicode with iBatis, and didn't have to do any
configuration for it to work.

However, I use Sybase and ran into a similar problem with Unicode and Java
(although not related to iBatis), so maybe this will help.

You should connect to oracle saying "I'm Unicode, and I want a UTF8
character set".  Basically what you are trying to do is prevent oracle from
changing the character set in the first place.  If you have unicode data in
the database, then connecting to it using jdbc and setting the parameter you
mention below might be telling oracle to translate the data to some
non-unicode character set.  You don't want it to do that, since the Spanish
character strings in oracle *are* in spanish, but they are *also* still in
unicode encoding.
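
To illustrate why a translation to a non-unicode character set is
dangerous, here is a minimal Java sketch.  It does not talk to Oracle at
all; it only shows Java's own behaviour when text is pushed through a
character set that cannot represent it, which is one plausible way to end
up with "?" characters.  The class and method names are made up for this
example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows how a narrow charset loses characters.
class LossyConversionDemo {

    // Encodes the text in the target charset and decodes it again,
    // imitating a round trip through that character set.
    static String roundTrip(String text, Charset target) {
        return new String(text.getBytes(target), target);
    }

    public static void main(String[] args) {
        String original = "Ca\u00F1adas"; // contains n-with-tilde (U+00F1)

        // UTF-8 can represent every character, so nothing is lost.
        System.out.println(original.equals(
                roundTrip(original, StandardCharsets.UTF_8)));

        // US-ASCII cannot represent U+00F1; String.getBytes substitutes
        // the charset's replacement byte, which is '?'.
        System.out.println(roundTrip(original, StandardCharsets.US_ASCII));
    }
}
```

Once the "?" has been substituted, the original character is gone; no later
conversion can bring it back.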

So, from here, I would check a couple of things:

1) Verify that the raw data in the database is ACTUALLY in correct UTF8
encoding.  (This could get messed up if someone INSERTed the value saying
that the string was a certain character set even though it was not that
character set.)
	If I connect to the database saying "I'm unicode" and I insert a
string that is actually in ISO_1 format (perhaps from the oracle commandline
client or any number of other non-utf8 sources) then oracle will NOT know
the difference; when a client connects saying "I'm unicode", then the server
will not try to do any translation of character sets since what is in the
data (utf8) matches what the client wants (utf8) but the DATA is *still
iso_1* !!!

	Essentially if the character set of the data in the database doesn't
match what the server *thinks* it is, then it will *never* be able to do a
conversion of one character set to another.  Likewise, if the client
character set matches the server character set, no conversion will ever be
done, but you still won't be getting utf8 data!

2) Check the above by connecting with straight jdbc, connecting as UTF8, and
verify that the bytes you pull out are correct utf8.  
3) If you can, connect with straight jdbc, connecting as iso_1 and verify
that you still get the expected characters. (as long as they are spanish
characters, oracle will have no problem converting from utf8 back into
iso_1.) Check the bytes!  Verify that connecting as iso_1 and pulling out
the upside-down question mark gives you a byte value of 191 (0xBF)!  (168
is that character's value in cp850, not iso_1.)
4) Always examine the data in the above steps using a unicode-capable string
viewer (JTextArea works nicely).  You don't even want to get into the
problem of your text-terminal's font not matching the data you're pulling out!
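
The byte checks in the steps above could be done with something like the
following Java sketch.  It works on a plain String (for example one you
obtained from a JDBC ResultSet) and prints the unsigned byte values for a
few encodings.  The class and helper names are invented for this example,
and the cp850 line only runs if your JVM ships the "IBM850" charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class for inspecting encoded bytes.
class ByteCheck {

    // Returns the unsigned (0-255) byte values of the text in the charset.
    static int[] unsignedBytes(String text, Charset cs) {
        byte[] raw = text.getBytes(cs);
        int[] out = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = raw[i] & 0xFF; // Java bytes are signed; mask to 0-255
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "\u00BF"; // the upside-down question mark

        // ISO-8859-1 (iso_1): a single byte, 191
        System.out.println(Arrays.toString(
                unsignedBytes(s, StandardCharsets.ISO_8859_1)));

        // UTF-8: two bytes, 194 and 191
        System.out.println(Arrays.toString(
                unsignedBytes(s, StandardCharsets.UTF_8)));

        // cp850: a single byte, 168 (only if the charset is installed)
        if (Charset.isSupported("IBM850")) {
            System.out.println(Arrays.toString(
                    unsignedBytes(s, Charset.forName("IBM850"))));
        }
    }
}
```

Seeing 168 where 191 is expected is a strong hint that the data is really
cp850 that has been labelled as iso_1.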

My trouble happened a while back when all of our IBM systems (using the
native codepage 850 character set) were inserting/selecting data from the
database.  The database's internal character set was iso_1, and all the
applications connecting to the server were connecting saying "I'm ISO_1"
[they were lying!].  Therefore, the server did no actual conversion
between cp850 and iso_1... The data that was put in as cp850 came out as
cp850... Even though every configuration option for both the server and
database clients led you to believe that the only thing we ever dealt with
was iso_1!  Problem was, when you connected to the database requesting a
different character set (such as utf8), it would try to do the
conversion from iso_1 to utf8... Conversion obviously failed. This caused us
HUGE headaches when we switched to UTF8 as the database server's native
character set.  You have to ensure that if you're specifying a character set
when INSERTING, that the data is ACTUALLY in that character set.  Once it
gets in there the wrong way, the server will NEVER be able to convert it
from its raw byte interpretation to anything else.  This situation appears
to be somewhat similar to yours!
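
The cp850-labelled-as-iso_1 situation can be reproduced in a few lines of
Java.  This is only a sketch of the idea, not of what Sybase or Oracle do
internally: the stored byte is hard-coded, the class and method names are
invented, and in Java the conversion does not fail outright but quietly
produces the wrong character.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: mislabelled bytes converted "correctly".
class MislabeledData {

    // Decodes the stored bytes the way a server would if it believed
    // they were ISO-8859-1 (iso_1).
    static String decodeAsServerBelieves(byte[] stored) {
        return new String(stored, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // 0xA8 is the upside-down question mark in cp850, written by a
        // client that claimed to be sending iso_1.
        byte[] stored = { (byte) 0xA8 };

        // The server trusts the label, so it reads 0xA8 as iso_1, where
        // that byte is the diaeresis U+00A8, not U+00BF.
        String decoded = decodeAsServerBelieves(stored);
        System.out.println(Integer.toHexString(decoded.charAt(0)));

        // A later iso_1 -> utf8 conversion faithfully encodes the wrong
        // character: the bytes are C2 A8 instead of C2 BF.
        for (byte b : decoded.getBytes(StandardCharsets.UTF_8)) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();
    }
}
```

The conversion step itself is doing its job; the damage was done when the
bytes went in under the wrong label.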

To Summarize:  Make absolutely sure that the database data is actually in
the character set your oracle server thinks it is.

Best of luck!  (phew!!)

-Bryan Shannon

-----Original Message-----
From: Juan Cañadas [mailto:jcgb219@tid.es]
Sent: Monday, April 17, 2006 4:41 AM
To: user-java@ibatis.apache.org
Subject: Unicode support for ibatis


  I'm working with ibatis, struts and Oracle 9... when I receive data
submitted through a form, UTF-8 encoded, it's stored
in an nvarchar2 column (Oracle 9)... When I display this data, it's
displayed as "????" in the html code...

 I'm working with Oracle's thin client
(NLS_LANG="SPANISH_SPAIN.AL32UTF8")... what's wrong?
Does ibatis support unicode? Must I configure anything more? Is there
some parameter to make ibatis work with
utf8? Is there a configuration example for ibatis - oracle - unicode?

