hc-httpclient-users mailing list archives

From Uncle <unclelongha...@gmail.com>
Subject Trying to follow 301 redirects results in 404 error
Date Sat, 24 Mar 2012 12:50:48 GMT
Apologies if this has already been addressed. I searched the archives and was unable to find
anything directly relating to this, though it seems straightforward.

I am trying to use HttpClient to obtain the redirect URL for a URL such as http://bit.ly/GGviSv,
but I am getting a 404 error.  This is a "permanent" redirect (code 301).  This code:

        String url = "http://bit.ly/GGviSv";
        HttpGet httpget = new HttpGet(url);
        HttpContext context = new BasicHttpContext();
        HttpClient httpclient = new DefaultHttpClient();

        HttpResponse response = httpclient.execute(httpget, context);

        RedirectStrategy redirectStrategy = new DefaultRedirectStrategy();

        log.info("isRedirected = " + redirectStrategy.isRedirected(httpget, response, context));
        for(Header header : response.getAllHeaders())
            log.info("header: " + header);

        log.info("status = " + response.getStatusLine());

outputs:

isRedirected = false
header: Server: nginx
header: Date: Sat, 24 Mar 2012 12:38:43 GMT
header: Content-Type: text/html; charset=UTF-8
header: Transfer-Encoding: chunked
header: Connection: keep-alive
header: Vary: Cookie
header: X-CF-Powered-By: WP 1.2.0
header: X-Pingback: http://lavamagazine.com/xmlrpc.php
header: Expires: Wed, 11 Jan 1984 05:00:00 GMT
header: Last-Modified: Sat, 24 Mar 2012 12:38:43 GMT
header: Cache-Control: no-cache, must-revalidate, max-age=0
header: Pragma: no-cache
status = HTTP/1.1 404 Not Found

I expected 1) isRedirected to be true, 2) the response code to be 301, and/or 3) the destination
URL to be in the headers where I could get it.  However, if I ignore the 404 and continue
getting the URL:

        HttpUriRequest currentReq = (HttpUriRequest) context.getAttribute(ExecutionContext.HTTP_REQUEST);
        HttpHost currentHost = (HttpHost) context.getAttribute(ExecutionContext.HTTP_TARGET_HOST);
        String currentUrl = (currentReq.getURI().isAbsolute())
                ? currentReq.getURI().toString()
                : (currentHost.toURI() + currentReq.getURI());
        httpclient.getConnectionManager().shutdown();
        log.info("Redirected URL = " + currentUrl);

This does the right thing and provides me with the correct URL.  So, why the 404 error?  I
am processing a large quantity of URLs and need to accurately determine which ones are errors,
redirects, etc.
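
One likely reading of the output above: the client follows redirects automatically by default, so the
status line and headers printed are those of the final hop (the X-Pingback header names
lavamagazine.com, not bit.ly), and the 301 itself is never seen.  To classify each URL, the hops can
be inspected one at a time with automatic following switched off.  The sketch below illustrates
that idea only.  It uses the JDK's HttpURLConnection rather than HttpClient, and the class name
RedirectProbe, the Hop type, and the local stub server with its /short and /gone paths are all
hypothetical stand-ins so the example runs without touching the network.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;

class RedirectProbe {

    /** Status code and Location header of a single hop, with no redirect following. */
    static final class Hop {
        final int status;
        final String location; // null when the response carries no Location header

        Hop(int status, String location) {
            this.status = status;
            this.location = location;
        }
    }

    /** Sends one GET and reports what that hop alone returned. */
    static Hop fetchOnce(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setInstanceFollowRedirects(false); // keep the 3xx response visible
        try {
            int status = conn.getResponseCode();
            return new Hop(status, conn.getHeaderField("Location"));
        } finally {
            conn.disconnect();
        }
    }

    /** Hypothetical stub: /short answers 301 pointing at /gone, and /gone answers 404. */
    static HttpServer startStubServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/short", exchange -> {
            exchange.getResponseHeaders().add("Location", "/gone");
            exchange.sendResponseHeaders(301, -1); // -1 means no response body
            exchange.close();
        });
        server.createContext("/gone", exchange -> {
            exchange.sendResponseHeaders(404, -1);
            exchange.close();
        });
        server.start();
        return server;
    }

    /** Walks both hops against the stub and returns a one-line summary. */
    static String demo() throws IOException {
        HttpServer server = startStubServer();
        try {
            String base = "http://127.0.0.1:" + server.getAddress().getPort();
            Hop first = fetchOnce(base + "/short");          // the redirect itself
            Hop second = fetchOnce(base + first.location);   // the redirect target
            return first.status + " -> " + first.location + " -> " + second.status;
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo()); // prints "301 -> /gone -> 404"
    }
}
```

Read this way, a short link can be a valid 301 whose destination is itself a 404, and recording
each hop separately keeps the two cases apart.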

Thanks for any assistance.

Randy

