wicket-users mailing list archives

From Garret Wilson <gar...@globalmentor.com>
Subject Re: resource encoding troubles
Date Thu, 28 Aug 2014 20:56:26 GMT
On 8/28/2014 11:14 AM, Andrea Del Bene wrote:
> It's just an encoding conflict: your properties uses ISO-8859-1, your 
> page UTF-8. The result is a bad rendering, as you can see. When Java 
> designers decided to adopt ISO-8859-1 they didn't consider most of the 
> Asian languages...
> PS: just as a personal advice, try to be less "rude" in your answers ;)

Andrea, I'm sorry, I'll really try. My answers were probably terse 
(short and to the point), and you probably sense a frustration on my 
part with the lack of basic understanding in the software development 
world on the fundamentals of software encoding.

For example, your answer seems to assume that some function simply loads 
two sets of bytes and merges them together. That's not what happens at 
all. (Or at least I hope that's not what happens---it would indicate 
that the coder had no idea how to approach the task.) In fact there are 
two layers to the encoding stack: the byte-level processing and the 
character-level processing. The Java Properties class should correctly 
take the bytes in the properties file and decode them as ISO 8859-1, 
producing a character stream to be parsed. This is already implemented 
in Java, and has been for well over a decade, I believe.
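
A minimal sketch of that byte-to-character step, using only the standard 
java.util.Properties API. The class name, the helper method, and the 
sample key are hypothetical and chosen for illustration; nothing here is 
taken from Wicket itself.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class PropertiesDecodingSketch {

    /**
     * Loads properties from raw bytes. Properties.load(InputStream) decodes
     * the bytes as ISO 8859-1, so the caller only ever sees characters.
     */
    static Properties loadFromBytes(byte[] bytes) {
        Properties properties = new Properties();
        try (InputStream in = new ByteArrayInputStream(bytes)) {
            properties.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties;
    }

    public static void main(String[] args) {
        // Hypothetical file content: the value ends in e-acute, which is the
        // single byte 0xE9 when the file is stored as ISO 8859-1.
        byte[] fileBytes = "greeting=caf\u00e9\n".getBytes(StandardCharsets.ISO_8859_1);

        String greeting = loadFromBytes(fileBytes).getProperty("greeting");

        // The loaded value is a String of characters, not bytes.
        System.out.println(greeting.equals("caf\u00e9")); // prints true
    }
}
```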

Similarly, an XML processor will take the bytes in an XML file and 
transform them based upon the encoding (in this case, UTF-8) and produce 
a stream of characters. All XML processors are required to be able to 
perform this transformation, and have been for well over a decade.
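
The same step for XML can be sketched with the JDK's built-in DOM parser. 
This is only an illustration under assumptions: the class name, helper 
method, and sample document are hypothetical, and it does not show how 
Wicket reads its markup.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class XmlDecodingSketch {

    /**
     * Parses XML from raw bytes and returns the text of the root element.
     * The parser decodes the bytes using the encoding the document declares.
     */
    static String rootText(byte[] xmlBytes) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document = builder.parse(new ByteArrayInputStream(xmlBytes));
            return document.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical document: e-acute is stored as two bytes (0xC3 0xA9) in UTF-8.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><greeting>caf\u00e9</greeting>";
        byte[] xmlBytes = xml.getBytes(StandardCharsets.UTF_8);

        // After parsing, the text is plain characters again.
        System.out.println(rootText(xmlBytes).equals("caf\u00e9")); // prints true
    }
}
```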

Now that both input sources produce data at the character level, the 
original byte-level encoding is irrelevant. At the character level, 
there is no "encoding conflict", because there is no encoding. (There 
exists the in-memory encoding used by the JVM, but that's irrelevant to 
the discussion and will certainly be the same for all strings used.) 
Thus the two input streams can be mixed together without worry of 
encoding. If this is not what happens within Wicket, there is a software 
bug---but not an "encoding conflict".
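
To make the point concrete, here is a small sketch in plain Java. The 
class and method names are hypothetical. The last part shows one possible 
kind of bug (decoding bytes with the wrong charset); that is an assumption 
for illustration, not a diagnosis of what Wicket actually does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharacterLevelSketch {

    /** Byte level to character level: decode bytes using the given charset. */
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        byte[] latin1Bytes = {(byte) 0xE9};             // e-acute in ISO 8859-1: one byte
        byte[] utf8Bytes = {(byte) 0xC3, (byte) 0xA9};  // e-acute in UTF-8: two bytes

        String fromProperties = decode(latin1Bytes, StandardCharsets.ISO_8859_1);
        String fromMarkup = decode(utf8Bytes, StandardCharsets.UTF_8);

        // Different bytes, same character once decoded correctly.
        System.out.println(fromProperties.equals(fromMarkup)); // prints true

        // Mixing the two is ordinary string concatenation; no encoding is involved.
        String merged = fromMarkup + " " + fromProperties;
        System.out.println(merged.length()); // prints 3

        // Assumed example of a bug: decoding the UTF-8 bytes as ISO 8859-1
        // yields two wrong characters instead of one correct one.
        String wrong = decode(utf8Bytes, StandardCharsets.ISO_8859_1);
        System.out.println(wrong.length()); // prints 2
    }
}
```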

I recommend you start by reading 
http://www.joelonsoftware.com/articles/Unicode.html . If you have any 
specific questions, I'll be happy to answer them.

I apologize again for being brusque, but I'll do my best to explain things 
if others honestly have questions.

Garret
