Subject: RE: HttpClient returns different response compared to browser
Date: Tue, 20 May 2008 16:20:16 +0200
From: "Kwik, Micky"
To: "HttpClient User Discussion"

Hey Sebb,

Thanks a lot. I used an HTTP sniffer and found out that the requests from my client were not correct. The problem is solved now :)

Micky

________________________________

From: sebb [mailto:sebbaz@gmail.com]
Sent: Tue 20-5-2008 15:22
To: HttpClient User Discussion
Subject: Re: HttpClient returns different response compared to browser

On 20/05/2008, Kwik, Micky wrote:
> Hi,
>
> I wrote a simple client to fetch documents from some websites. But I found that the HttpClient often gets a different response compared to the browser even if the HTTP status code is 200.

In what way is the response different?

> For example this URL: http://www.elsevierfiscaal.nl/els/enc/productserviceoverzicht/id1101-31813/search/true/channelId/1101/update-14-aangifte-assistent-2008.html or http://www.belastingdienst.nl/zakelijk/nieuwsbrief/nieuwsberichten/2008-04-02-08_franke.html
>
> Here is my code snippet:
> HttpClient client = setUpClient(aUrl);
> GetMethod method = new GetMethod();
> method.getParams().setParameter("http.useragent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14");
> method.getParams().setBooleanParameter("http.protocol.single-cookie-header", true);
> method.setFollowRedirects(false);
> method.getParams().setCookiePolicy(CookiePolicy.BROWSER_COMPATIBILITY);
>
> int returnCode = client.executeMethod(method);
> if (returnCode == HttpStatus.SC_OK) {
>     fetchBody(aUrl, method);
> }
> I already varied Cookie policies and followRedirects but with no success. Is there a way of solving this or is it a case of "http client is not a browser"?

So long as HttpClient is set up to send the correct HTTP requests, the
server cannot tell if it is talking to a browser or not.
[Well, I suppose it could do some clever tricks with timing or
Javascript. But that is unlikely to be the case here.]

So you just need to find out what the difference is between what the
browser sends and what HttpClient is sending. There may be some extra
hidden fields or other parameters that have been overlooked.

A protocol sniffer such as Wireshark - or a recording proxy - would be
helpful here.

> Kind regards,
> Micky
>
> Disclaimer:
> This message contains information that may be privileged or confidential and is the property of Sogeti Nederland B.V. or its Group members. It is intended only for the person to whom it is addressed. If you are not the intended recipient, you are not authorized to read, print, retain, copy, disseminate, distribute, or use this message or any part thereof.
> If you receive this message in error, please notify the sender immediately and delete all copies of this message.
>

---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-users-unsubscribe@hc.apache.org
For additional commands, e-mail: httpclient-users-help@hc.apache.org
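
Once both requests have been captured with a sniffer or recording proxy, as suggested in the thread, comparing them by hand can be tedious. The following is a minimal sketch of how the two raw dumps might be compared header by header. The class and method names (RequestDiff, parseHeaders, diff) and the sample requests are hypothetical and not taken from the thread; the code uses only the Java standard library and, as a simplification, keeps just the last value when a header is repeated.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of HttpClient: compares two captured raw HTTP requests.
final class RequestDiff {

    /** Parses the header section of a raw HTTP request dump into a case-insensitive map. */
    static Map<String, String> parseHeaders(String rawRequest) {
        Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        String[] lines = rawRequest.split("\r?\n");
        // lines[0] is the request line (e.g. "GET /page HTTP/1.1"); headers start at index 1.
        for (int i = 1; i < lines.length; i++) {
            String line = lines[i];
            if (line.isEmpty()) {
                break; // a blank line ends the header section
            }
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // skip lines that are not "Name: value"
            }
            // Simplification: a repeated header overwrites the earlier value.
            headers.put(line.substring(0, colon).trim(), line.substring(colon + 1).trim());
        }
        return headers;
    }

    /** Returns one entry per header that is missing on one side or has a different value. */
    static Map<String, String> diff(Map<String, String> browser, Map<String, String> client) {
        Set<String> names = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        names.addAll(browser.keySet());
        names.addAll(client.keySet());

        Map<String, String> result = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (String name : names) {
            String b = browser.get(name);
            String c = client.get(name);
            if (b == null) {
                result.put(name, "only sent by client: " + c);
            } else if (c == null) {
                result.put(name, "only sent by browser: " + b);
            } else if (!b.equals(c)) {
                result.put(name, "browser=" + b + " | client=" + c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up captures for illustration only.
        String browserDump = "GET /page.html HTTP/1.1\r\n"
                + "Host: www.example.org\r\n"
                + "Accept-Language: nl,en\r\n"
                + "Cookie: session=abc\r\n\r\n";
        String clientDump = "GET /page.html HTTP/1.1\r\n"
                + "Host: www.example.org\r\n\r\n";

        diff(parseHeaders(browserDump), parseHeaders(clientDump))
                .forEach((name, note) -> System.out.println(name + ": " + note));
    }
}
```

Any header listed as "only sent by browser" is a candidate for what the server keys on. This sketch does not compare the request line or the body, where hidden form fields would appear, so those still need a look in the capture itself.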