hc-dev mailing list archives

From "Ian Beaumont (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HTTPCLIENT-1149) EntityUtils.toString should detect Byte order mark (BOM) and remove it if present
Date Thu, 01 Dec 2011 11:42:39 GMT
EntityUtils.toString should detect Byte order mark (BOM) and remove it if present

                 Key: HTTPCLIENT-1149
                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1149
             Project: HttpComponents HttpClient
          Issue Type: Bug
          Components: HttpClient
    Affects Versions: 4.1.2
         Environment: Windows
            Reporter: Ian Beaumont
            Priority: Minor

EntityUtils.toString should detect a byte order mark (BOM) at the start of the input stream and remove it;
otherwise, stray unwanted characters are left at the start of the returned string.
Possible byte order marks are listed at http://en.wikipedia.org/wiki/Byte_order_mark
I'm not sure whether EntityUtils.toString uses the BOM to try to detect the encoding, but if it
doesn't, then it should.
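
The requested behaviour could look roughly like the sketch below. It is not HttpClient code: BomDecoder and decode are hypothetical names, the input is assumed to be the raw entity bytes, and only the UTF-8 and UTF-16 marks are handled (a UTF-32LE mark also starts with FF FE and would be misread here). It assumes Java 7 or later for StandardCharsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of HttpClient.
class BomDecoder {

    // Decodes data, using a leading BOM (if any) to pick the charset and
    // dropping the BOM bytes. Falls back to the given charset when no BOM is found.
    static String decode(byte[] data, Charset fallback) {
        Charset charset = fallback;
        int skip = 0;
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) {
            charset = StandardCharsets.UTF_8;
            skip = 3;
        } else if (startsWith(data, 0xFE, 0xFF)) {
            charset = StandardCharsets.UTF_16BE;
            skip = 2;
        } else if (startsWith(data, 0xFF, 0xFE)) {
            // Assumption: UTF-32LE (FF FE 00 00) is not expected in this sketch.
            charset = StandardCharsets.UTF_16LE;
            skip = 2;
        }
        return new String(data, skip, data.length - skip, charset);
    }

    // True if data begins with the given unsigned byte values.
    static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] body = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(decode(body, StandardCharsets.ISO_8859_1));
    }
}
```

With HttpClient, the byte array could come from EntityUtils.toByteArray(entity) instead of EntityUtils.toString(entity).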

An example URL that causes this issue is the Microsoft Virtual Earth WSDL file:
HttpClient httpclient = new DefaultHttpClient();
HttpGet httpget = new HttpGet("http://dev.virtualearth.net/webservices/v1/searchservice/searchservice.svc?wsdl");
HttpResponse response = httpclient.execute(httpget);
HttpEntity entity = response.getEntity();
String textContents = EntityUtils.toString(entity);
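
The symptom can be reproduced without a network call. The sketch below assumes the response body starts with a UTF-8 BOM (EF BB BF) followed by XML; the report does not say which BOM the WSDL carries or which charset EntityUtils.toString selects, so both are assumptions. BomSymptom, decodeAs and stripLeadingBom are hypothetical names, and StandardCharsets requires Java 7 or later.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demonstration class; not part of HttpClient.
class BomSymptom {

    // Assumed body: a UTF-8 BOM followed by the first characters of an XML document.
    static final byte[] BODY = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', '?', 'x', 'm', 'l'};

    // Decodes the assumed body with the given charset, without any BOM handling.
    static String decodeAs(Charset charset) {
        return new String(BODY, charset);
    }

    // Removes a single leading U+FEFF, which is what a UTF-8 BOM becomes
    // when the bytes are decoded as UTF-8.
    static String stripLeadingBom(String s) {
        return (!s.isEmpty() && s.charAt(0) == '\uFEFF') ? s.substring(1) : s;
    }

    public static void main(String[] args) {
        // Decoded as ISO-8859-1, the BOM bytes become three visible characters.
        System.out.println(decodeAs(StandardCharsets.ISO_8859_1).length());
        // Decoded as UTF-8, the BOM survives as one invisible U+FEFF character.
        System.out.println(decodeAs(StandardCharsets.UTF_8).length());
        System.out.println(stripLeadingBom(decodeAs(StandardCharsets.UTF_8)));
    }
}
```

Stripping U+FEFF from the decoded string only helps when the body was decoded with the matching Unicode charset; if it was decoded as ISO-8859-1, the BOM has already turned into three ordinary characters.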


