tomcat-users mailing list archives

From Michael Ludwig <mil...@gmx.de>
Subject Re: Setting encoding for tomcat compiler
Date Thu, 27 Nov 2008 02:25:47 GMT
Ronald Vyhmeister wrote on 27.11.2008 at 08:47:07 (+0800):
> In looking through the documentation, it looks like the default
> encoding for the compiler is ISO-8859-1.

Not quite. The javac man page (1.4, 1.6 ...) has this to say:

  -encoding encoding
    Set the source file encoding name, such as EUC-JP and UTF-8. If
    -encoding is not specified, the platform default converter is used.
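
To see what "platform default" resolves to on a particular machine, a
small sketch like the following can help. The class and method names are
made up for illustration, and it assumes javac is started with the same
defaults as the JVM that runs this check.

```java
import java.nio.charset.Charset;

public class DefaultEncoding {
    // Name of the charset the JVM uses when no encoding is given explicitly.
    static String platformDefault() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println("Default charset: " + platformDefault());
        // The system property the default has traditionally been derived from.
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));
    }
}
```

If the value printed is not the encoding your source files are saved in,
pass -encoding explicitly.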

> I need to use Windows-1251 (Russian input). The javac compiler takes
> an encoding option, but I have not figured out (maybe it's just too
> late) how to make it use that encoding for all files (only one
> application on the server, so no need to have multiple choices)...

Always use that option. Or define an alias, if you're on UNIX. Or write
a shell script calling javac with your options. Or if you use an IDE,
configure it accordingly.

> The database (postgresql) is UTF8, and will auto convert from WIN1251,
> but right now it's receiving the stuff as LATIN1 (8859-1)...

That doesn't have anything to do with javac, where you specify the
*source file* encoding.

An application dealing with different encodings has to be made aware of
the issue. When reading text data, always specify the correct character
encoding. If you read CP1251 and have your application believe it is
Latin-1, your results won't make much sense.

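You need code along these lines, which takes the encodings as
parameters. In the run shown after the listing, the CP1251 file is
deliberately read as latin1, and the output is garbage: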

C:\dev\Java\Encoding :: more /t1 Convert.java
/*
 * Converts a text file from one character encoding to another.
 */

import java.io.*;

public class Convert {
 public static void main( String[] args) throws IOException {
  // Explicit check: assert statements are skipped unless the JVM runs with -ea.
  if ( args.length < 4 ) {
   System.err.println(
    "Arguments: source-file source-encoding target-file target-encoding");
   System.exit( 1);
  }
  // try-with-resources (Java 7+) closes both streams, even if one close() fails.
  try (
   Reader in = new BufferedReader(
     new InputStreamReader(
      new FileInputStream( args[0]), args[1]));
   Writer out = new BufferedWriter(
     new OutputStreamWriter(
      new FileOutputStream( args[2]), args[3]))
  ) {
   int c;
   while ( (c = in.read()) != -1 )
    out.write( c);
  }
 }
}

C:\dev\Java\Encoding :: java -cp . Convert CP1251.txt latin1 Murks.txt utf-8

C:\dev\Java\Encoding :: more Murks.txt
????€???­ ???°? ?Š?€? ?­???­ ?????®?­???? ???®?????? ???® ???°???¬??
???§?°?»???
®?? ?? ?????¬??? ??
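
The same effect can be reproduced in memory, without any files. This is
only a sketch: the class name, the helper method and the sample word are
invented for illustration. It encodes a Russian word as CP1251 bytes and
then decodes those bytes once with the right charset and once as Latin-1.

```java
import java.nio.charset.Charset;

public class DecodeDemo {
    // Decode raw bytes using the named charset.
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // The Russian word "Privet" in Cyrillic, written with Unicode
        // escapes so this source file is pure ASCII and compiles the
        // same way whatever -encoding javac is given.
        String original = "\u041F\u0440\u0438\u0432\u0435\u0442";
        byte[] cp1251 = original.getBytes(Charset.forName("windows-1251"));

        String right = decode(cp1251, "windows-1251");
        String wrong = decode(cp1251, "ISO-8859-1");

        System.out.println("decoded as windows-1251 matches: " + right.equals(original));
        System.out.println("decoded as ISO-8859-1 matches:   " + wrong.equals(original));
    }
}
```

The bytes are identical in both cases; only the charset named when
decoding decides whether the text survives.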

Michael Ludwig


