hc-httpclient-users mailing list archives

From "Stephen J. Butler" <stephen.but...@gmail.com>
Subject Re: why are so different the response headers as reported by wget and httpclient?
Date Tue, 07 May 2013 15:10:56 GMT
On Tue, May 7, 2013 at 3:36 AM, Albretch Mueller <lbrtchx@gmail.com> wrote:

> > It's passing the wrong 'Host' line. That's why you are getting a 404.
>
>  well, the code section in which I set the host I pasted below (
> notice how I set the Host as part of the Request Headers  {"Host",
> (httpGet.getURI()).getHost()} ):


Yes, but that's not correct after the redirect. The first time it connects,
the 'Host' line should be 'download.ted.com'. The second time, after
following the redirect, it needs to be 'video.ted.com'.
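
To make that concrete, here is a minimal sketch of deriving the Host value from whichever URI is actually being requested at each hop. The class name, the method name, and the paths are made up for illustration; only the two host names come from this thread.

```java
import java.net.URI;

public class HostPerHop {
    // Derive the Host header value from the URI of the request being sent.
    // (Hypothetical helper, not part of HttpClient.)
    public static String hostHeaderFor(URI uri) {
        int port = uri.getPort();
        // getPort() returns -1 when the URI names no explicit port; omit it then.
        return port == -1 ? uri.getHost() : uri.getHost() + ":" + port;
    }

    public static void main(String[] args) {
        // Hypothetical paths; the real ones are whatever the server sends back.
        URI first = URI.create("http://download.ted.com/example.mp4");
        URI second = URI.create("http://video.ted.com/example.mp4");
        System.err.println("// __ Host for first request:  |" + hostHeaderFor(first) + "|");
        System.err.println("// __ Host for second request: |" + hostHeaderFor(second) + "|");
    }
}
```

The point is only that the value has to be recomputed per request. A Host computed once from the original URL is stale as soon as a redirect points at another server.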


> > Are you using standard code to handle the redirect, or writing your own?
>
>  I will have to do the redirect myself and I would like to handle/get
> all the response headers of every redirect. How do you do that? Could
> you point me to some basic redirect code example?
>
> ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
>   HttpGet httpGet = new HttpGet(aGetURL);
>   String aRqLn = httpGet.getRequestLine().toString();   // Request Line
> System.err.println("// __ httpGet.getRequestLine(): |" + aRqLn + "|");
>
> ...
> // __ parsing host from URL
>   String aHost = (httpGet.getURI()).getHost();
> System.err.println("// __ aHost: |" + aHost + "|");
>
>   String[][] aRqHdrs = new String[][]{
>      {"Host", aHost}
>    , {"Connection", "keep-alive"}
>    , {"User-Agent", "Mozilla/5.0 (X11; Linux i686; rv:10.0.4) Gecko/20100101 Firefox/10.0.4 Iceweasel/10.0.4"}
>    , {"Accept", "text/html, text/*;q=0.9, image/jpeg;q=0.9, image/png;q=0.9, image/*;q=0.9, */*;q=0.8"}
>    , {"Accept-Encoding", "gzip, deflate, x-gzip, x-deflate"}
>    , {"Accept-Charset", "utf-8,*;q=0.5"}
>    , {"Accept-Language", "en-US,en;q=0.9"}
>   };
>

I see, you're setting the Host header manually! That's why HttpClient is
failing to set it properly: the value you took from the original URL is
still in place when the redirect is followed.

I'd suggest you drop the manual Host header and let HttpClient set it.
It adds one to every request on its own, because the HTTP/1.1 spec
requires it.

If you insist on handling the redirect manually, set
the ClientPNames.HANDLE_REDIRECTS parameter to Boolean.FALSE. Executing
your request will then return the raw redirect response (a 3xx status with
a Location header), and you'll have to catch that case and resubmit the
request to the URL it names.
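
Since you asked for a basic example, here is a rough sketch of that catch-and-resubmit loop, keeping the headers of every hop. It is deliberately library-neutral so it can run on its own: Hop, Fetcher, and follow are hypothetical names, not HttpClient classes, and the fake fetcher in main stands in for a real request made with automatic redirects switched off.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ManualRedirects {
    /** One response in the chain: requested URI, status code, headers. */
    public static final class Hop {
        public final URI uri;
        public final int status;
        public final Map<String, String> headers;

        public Hop(URI uri, int status, Map<String, String> headers) {
            this.uri = uri;
            this.status = status;
            this.headers = headers;
        }
    }

    /** Stand-in for "execute one GET without following redirects". */
    public interface Fetcher {
        Hop get(URI uri);
    }

    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
            || status == 307 || status == 308;
    }

    /** Follows redirects by hand and returns every response seen, in order. */
    public static List<Hop> follow(URI start, Fetcher fetcher, int maxRequests) {
        List<Hop> hops = new ArrayList<>();
        URI current = start;
        for (int i = 0; i < maxRequests; i++) {
            Hop hop = fetcher.get(current);
            hops.add(hop);
            // Assumes the fetcher stores the header under exactly this name.
            String location = hop.headers.get("Location");
            if (!isRedirect(hop.status) || location == null) {
                return hops; // final response reached
            }
            // resolve() also copes with a relative Location value.
            current = current.resolve(location);
        }
        throw new IllegalStateException("gave up after " + maxRequests + " requests");
    }

    public static void main(String[] args) {
        // Fake fetcher: the first host redirects, the second answers 200.
        Fetcher fake = uri -> {
            Map<String, String> headers = new LinkedHashMap<>();
            if ("download.ted.com".equals(uri.getHost())) {
                headers.put("Location", "http://video.ted.com" + uri.getPath());
                return new Hop(uri, 302, headers);
            }
            headers.put("Content-Type", "video/mp4");
            return new Hop(uri, 200, headers);
        };
        for (Hop hop : follow(URI.create("http://download.ted.com/example.mp4"), fake, 5)) {
            System.err.println("// __ " + hop.status + " " + hop.uri + " " + hop.headers);
        }
    }
}
```

With HttpClient 4.x, the Fetcher body would presumably build a new HttpGet for the given URI, set ClientPNames.HANDLE_REDIRECTS to Boolean.FALSE on it, execute it, and copy the status line and response.getAllHeaders() into a Hop. Because each hop gets a fresh request built from the current URI, no Host header needs to be set by hand.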
