hc-httpclient-users mailing list archives

From Khosro Asgharifard Sharabiani <khosro_quest...@yahoo.com>
Subject Obtaining charset of page from HttpResponse.
Date Tue, 16 Aug 2011 11:42:07 GMT
Hello,
I use the following code to find the charset of a page, but it does not work for the page "http://www.annahar.com/content.php?priority=1&table=main&type=main&day=Mon"

Code : 
 [code]

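// Imports needed by this snippet (Apache HttpClient 4.x)
import java.io.IOException;
import org.apache.http.Header;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;

// Body of a method:
try {
    HttpClient httpclient = new DefaultHttpClient();
    String url = "http://www.annahar.com/content.php?priority=1&table=main&type=main&day=Mon";
    HttpGet httpget = new HttpGet(url);
    HttpResponse response = httpclient.execute(httpget);
    HttpEntity entity = response.getEntity();
    if (entity != null) {
        // Print the raw Content-Type header value sent by the server
        Header[] allHeaders = response.getHeaders("Content-Type");
        if (allHeaders.length > 0) {
            System.out.println(allHeaders[0].getValue());
        }
    }
} catch (ClientProtocolException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}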
[/code]


The output of the code above is: text/html.
But I think the output should be "text/html; charset=windows-1256". Am I right?

But when I use "http://bigbrowser.blog.lemonde.fr/2011/08/03/iran-le-mossad-derriere-le-meurtre-dun-scientifique-spiegel"
as the URL in the code, it returns "text/html; charset=UTF-8", which I think is correct.
It seems to work for some pages but not all of them. Why does this happen?
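
To make clear what I am trying to get, here is a minimal sketch of pulling the charset out of a Content-Type header value, using only the JDK. The class and method names (ContentTypeCharset, charsetOf) are hypothetical and not part of HttpClient, and the parsing is deliberately simple (it assumes parameters are separated by ";" and does not handle a ";" inside a quoted value). It shows that a value of just "text/html" carries no charset parameter at all, so in that case the charset would presumably have to come from somewhere else, for example the page's own markup.

```java
import java.util.Locale;

public class ContentTypeCharset {

    // Hypothetical helper: returns the charset parameter of a Content-Type
    // header value, or null when the value has no charset parameter.
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        // parts[0] is the media type; the rest are name=value parameters
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue; // not a name=value pair, skip it
            }
            String name = param.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            if (name.equals("charset")) {
                String value = param.substring(eq + 1).trim();
                // Strip optional surrounding double quotes
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // The two header values observed above
        System.out.println(charsetOf("text/html; charset=UTF-8")); // prints UTF-8
        System.out.println(charsetOf("text/html"));                // prints null
    }
}
```

With the two header values I got, this prints UTF-8 for the lemonde.fr page and null for the annahar.com page, which is exactly the case I do not know how to handle.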
 

Khosro.