hc-httpclient-users mailing list archives

From Brad Hadfield <brad.hadfi...@pitchpointsolutions.com>
Subject Character Encoding UTF-8
Date Wed, 16 Mar 2005 21:10:13 GMT

I would greatly appreciate your help.

I am aware that an earlier post mentions problems with character 
encodings. I've also read the material on the httpclient site. I can only
assume that I am doing something incorrectly or that my problems are due 
to a lack of understanding of character encoding.

Specifically, we are sending XML in the body of a POST. To show the kind 
of output I am getting, I have created a small example.

The following code segment results in the log output below:

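    // Test string containing non-ASCII (accented) characters.
    String characters =
        "These are the characters - Abcde;+[XáÇëèÔ] - some may fail.";

    log.debug("Content String: " + characters);

    URL url = new URL("http://localhost:1234/ppsi/xxx.hxi");
    // No URI is passed to PostMethod; the log below shows the request line as "POST /".
    PostMethod post = new PostMethod();
    post.setRequestHeader("Content-type", "text/plain; charset=UTF-8");

    HttpClient cl = new HttpClient();
    cl.getHostConfiguration().setHost(url.getHost(), url.getPort());
    // (The calls that set the request body and execute the method are not shown.)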


main           | DEBUG test.PPSTest          Start.

main           | DEBUG test.PPSTest          Content String: These are the characters - Abcde;+[XáÇëèÔ] - some may fail.

main           | DEBUG httpclient.wire       >> "POST / HTTP/1.1[\r][\n]"

main           | DEBUG httpclient.wire       >> "Content-type: text/plain; charset=UTF-8[\r][\n]"

main           | DEBUG httpclient.wire       >> "User-Agent: Jakarta 

main           | DEBUG httpclient.wire       >> "Host: 

main           | DEBUG httpclient.wire       >> "Content-Length: 59[\r][\n]"

main           | DEBUG httpclient.wire       >> "[\r][\n]"

main           | DEBUG httpclient.wire       >> "These are the 
characters - 

- some may "

main           | DEBUG httpclient.wire       << "HTTP/1.1 400 No Host matches server name localhost[\r][\n]"

main           | DEBUG httpclient.wire       << "Transfer-Encoding: 

main           | DEBUG httpclient.wire       << "Date: Wed, 16 Mar 2005 20:56:59 GMT[\r][\n]"

main           | DEBUG httpclient.wire       << "Server: 

main           | DEBUG httpclient.wire       << "Connection: close[\r][\n]"

main           | DEBUG test.PPSTest          end.


Why do the Unicode "placeholder" characters appear? Shouldn't the UTF-8 
encoding be able to handle them? Why is the output truncated?
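
One thing that may be worth checking for the truncation is whether the 
Content-Length was computed from the number of characters rather than the 
number of UTF-8 bytes. The sketch below is only an illustration in plain 
Java (no HttpClient calls); the class and method names are made up for 
this example, and the accented characters are written as Unicode escapes 
so the result does not depend on the source file encoding. It compares 
the two counts for the test string above:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of HttpClient.
public class Utf8LengthCheck {

    // The test string from the message, with á Ç ë è Ô as Unicode escapes.
    static final String CHARACTERS =
        "These are the characters - Abcde;+[X\u00e1\u00c7\u00eb\u00e8\u00d4] - some may fail.";

    // Number of Java chars in the string.
    static int charCount(String s) {
        return s.length();
    }

    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8ByteCount(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // prints "chars=59 utf8Bytes=64"
        System.out.println("chars=" + charCount(CHARACTERS)
            + " utf8Bytes=" + utf8ByteCount(CHARACTERS));
    }
}
```

The string has 59 characters but needs 64 bytes in UTF-8, because each of 
the five accented characters takes two bytes. The logged Content-Length 
is 59, and the five missing bytes match the length of the missing 
"fail.", so the log is at least consistent with a body written as UTF-8 
but cut off at the character count. This is an inference from the numbers 
only, not a confirmed diagnosis.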

