hc-httpclient-users mailing list archives

From Jeff Davis <j...@FlyingDiamond.com>
Subject Re: I am having issues logging into a website and retrieving the source of the page after login
Date Sat, 13 Feb 2010 23:13:08 GMT
Hi Robert,

The usual practice for this kind of task is to use a packet sniffer to
capture the data that actually goes across the wire when you log in
with a browser, capture the same when your application runs, and then
see what is different. Wireshark (www.wireshark.org) works well for
this if the login page is not served over SSL. The Firebug plugin for
Firefox gives some of this information, and there are other great
products as well.
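
To make the comparison concrete, here is a rough sketch of the kind of
diff I mean, done on request headers copied out of a capture. It uses
only the JDK, not http-client, and the class name, method name and
sample header values are all made up for illustration. It assumes you
have already pulled the headers from the browser request and from your
application's request into two maps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: reports how the application's request headers
// differ from the ones the browser sent for the same request.
class HeaderDiff {

    public static List<String> diff(Map<String, String> browser,
                                    Map<String, String> app) {
        // HTTP header names are case-insensitive, so compare them that way.
        Map<String, String> b = new TreeMap<String, String>(String.CASE_INSENSITIVE_ORDER);
        b.putAll(browser);
        Map<String, String> a = new TreeMap<String, String>(String.CASE_INSENSITIVE_ORDER);
        a.putAll(app);

        List<String> report = new ArrayList<String>();
        for (Map.Entry<String, String> e : b.entrySet()) {
            String appValue = a.get(e.getKey());
            if (appValue == null) {
                // The browser sent this header, the application did not.
                report.add("missing in app: " + e.getKey() + ": " + e.getValue());
            } else if (!appValue.equals(e.getValue())) {
                report.add("differs: " + e.getKey()
                        + " (browser=" + e.getValue() + ", app=" + appValue + ")");
            }
        }
        for (String name : a.keySet()) {
            if (!b.containsKey(name)) {
                // The application sent a header the browser never sends.
                report.add("extra in app: " + name);
            }
        }
        return report;
    }
}
```

Anything reported as missing or different (Cookie, Referer and
User-Agent are common ones) is a candidate for why the site sends you
back to the login page. The same idea applies to the POST body.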

Many third-party sites use various tricks to thwart automated logins,
so you might have to follow some redirects, parse some JavaScript,
etc. In short, you have to make your Java application behave exactly
like a browser would, and the best way to do that is to watch exactly
what the browser does.
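
As a rough illustration of what "behave like a browser" involves at
the HTTP level, the sketch below shows three small steps a browser does
for you: encoding the login form, sending back the cookies the server
set, and resolving a redirect target. It is plain JDK code rather than
http-client, and the class and method names, URLs and field names are
all hypothetical. It also ignores cookie domain, path and expiry rules,
which a real client has to honor.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;
import java.util.List;
import java.util.Map;

// Hypothetical helpers for the steps a browser performs during a login.
class BrowserLikeSteps {

    // Encode form fields as an application/x-www-form-urlencoded body,
    // which is what a browser sends when a plain HTML form is POSTed.
    public static String formBody(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (sb.length() > 0) sb.append('&');
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(ex);
        }
    }

    // Build a Cookie request header from Set-Cookie response values.
    // Simplification: keeps only name=value and drops every attribute.
    public static String cookieHeader(List<String> setCookieValues) {
        StringBuilder sb = new StringBuilder();
        for (String value : setCookieValues) {
            int semi = value.indexOf(';');
            String pair = (semi >= 0 ? value.substring(0, semi) : value).trim();
            if (pair.isEmpty()) continue;
            if (sb.length() > 0) sb.append("; ");
            sb.append(pair);
        }
        return sb.toString();
    }

    // Resolve a Location header, which may be relative, against the
    // URI of the request that produced the redirect.
    public static URI resolveRedirect(URI requestUri, String location) {
        return requestUri.resolve(location);
    }
}
```

If the capture shows the browser doing any of these and your
application not doing it, that is usually where the login falls over.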

If you come up with a specific need, such as handling cookies or
setting request headers, and need help implementing that with
http-client, you will get plenty of help.

I didn't get the attached zip, so I couldn't really look into anything
further, and even then your best bet is to seek out the information you
need using the tips above.

Best of luck,

Robert Stone wrote:
> Everyone,
> I am trying to log into a website and retrieve the source of the page 
> after the login page. I am using the latest version of http-client 
> 4.0.1 and I am having trouble getting past the login page. I have 
> checked several forums and all the posts I have found seem to 
> reference older versions of http-client which don't use the same 
> classes as 4.0.1. I have attached a zip file that contains a copy of 
> the login form and my Java class that contains the method that I am 
> trying to use to get the page source. For some reason I can't seem to 
> login. No matter what I do it takes me right back to the login page. 
> Any help will be greatly appreciated. Also please note that I don't 
> have control over the login form or any of the web content as it is on 
> a third party site that I am trying to pull some data off of once I 
> get logged in.
> Thanks, 
> Robert
