hc-httpclient-users mailing list archives

From "Ander Juaristi Alamos" <ajuari...@gmx.es>
Subject Re: When should I use the ClientAuthentication ?
Date Mon, 01 Dec 2014 16:00:16 GMT
I've never used it, but it appears to be an example of HTTP Basic/Digest authentication,
as defined in RFC 2617: http://www.ietf.org/rfc/rfc2617.txt. Someone please correct me if I'm
wrong.

If your crawler hits a site that requires user authentication, it won't be able to scrape anything
except the entity body sent along with the 401 response, which usually isn't very meaningful. You
need to know the username/password credentials for every protected site your crawler visits in
order to get the actual content.
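
To make the Basic scheme concrete, here is a minimal sketch of how the credentials end up in the request, assuming the RFC 2617 Basic scheme. The class and method names are made up for illustration and it uses only the JDK, not HttpClient. The username and password are joined with a colon, Base64-encoded, and sent in the Authorization header:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of HttpClient: shows what the Basic scheme puts on the wire.
class BasicAuthHeaderSketch {

    // Builds the Authorization header value: "Basic " + base64(user + ":" + password).
    static String basicHeader(String user, String password) {
        String pair = user + ":" + password;
        byte[] raw = pair.getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    public static void main(String[] args) {
        // Sample credentials taken from the example in RFC 2617.
        System.out.println(basicHeader("Aladdin", "open sesame"));
        // prints "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="
    }
}
```

Note that this is only an encoding, not encryption, so Basic authentication is normally used over HTTPS. Digest is more involved and is not covered by this sketch.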

If you want to give it a try, you can set up basic HTTP authentication with PHP. Here are
a couple of links:
http://stackoverflow.com/questions/4150507/how-can-i-use-basic-http-authentication-in-php
http://php.net/manual/en/features.http-auth.php
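
If you would rather stay in Java than set up PHP, a throwaway protected zone can probably be built with the HTTP server that ships with the JDK (com.sun.net.httpserver). This is only a sketch under my own assumptions: the path "/private", the realm "test-realm", the credentials and the class name are all invented, and the client side uses plain HttpURLConnection rather than HttpClient so that it runs without extra dependencies.

```java
import com.sun.net.httpserver.BasicAuthenticator;
import com.sun.net.httpserver.HttpContext;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical test fixture: a local server with one path protected by Basic authentication.
class ProtectedZoneSketch {

    // Starts the server on an ephemeral local port; only /private requires credentials.
    static HttpServer start(String user, String password) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        HttpContext ctx = server.createContext("/private", exchange -> {
            byte[] body = "secret page".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        // The authenticator answers 401 with a WWW-Authenticate challenge unless the check passes.
        ctx.setAuthenticator(new BasicAuthenticator("test-realm") {
            @Override
            public boolean checkCredentials(String u, String p) {
                return user.equals(u) && password.equals(p);
            }
        });
        server.start();
        return server;
    }

    // Sends a GET to /private and returns the status code; pass null to send no credentials.
    static int status(int port, String userColonPassword) throws IOException {
        URI uri = URI.create("http://127.0.0.1:" + port + "/private");
        HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
        if (userColonPassword != null) {
            String token = Base64.getEncoder()
                    .encodeToString(userColonPassword.getBytes(StandardCharsets.UTF_8));
            conn.setRequestProperty("Authorization", "Basic " + token);
        }
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = start("crawler", "s3cret");
        int port = server.getAddress().getPort();
        try {
            System.out.println("without credentials: " + status(port, null));
            System.out.println("with credentials:    " + status(port, "crawler:s3cret"));
        } finally {
            server.stop(0);
        }
    }
}
```

Running main should report 401 for the first request and 200 for the second. Once the server is up, you could point your crawler (with the ClientAuthentication-style credentials configured for 127.0.0.1 and the printed port) at the same URL instead of the small status helper.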
 
Regards,

- AJ
 
 

Sent: Tuesday, November 25, 2014 at 15:27
From: "Avi Hayun" <avraham2@gmail.com>
To: "HttpClient User Discussion" <httpclient-users@hc.apache.org>
Subject: When should I use the ClientAuthentication ?
I am maintaining a Web Crawler.

I want to integrate crawling of sites which have username/password zones.

I successfully integrated FORM based authentication.

I also want to integrate the ClientAuthentication example I can see here:
https://hc.apache.org/httpcomponents-client-ga/httpclient/examples/org/apache/http/examples/client/ClientAuthentication.java

But in order to check it out I need a scenario: a site with a zone
protected by this type of authentication.

Can anybody supply me with an example where I can use this
ClientAuthentication in order to crawl?


