hc-httpclient-users mailing list archives

From Avi Hayun <avrah...@gmail.com>
Subject Re: When should I use the ClientAuthentication ?
Date Tue, 02 Dec 2014 10:02:47 GMT
Thank you very much.

It works.

I appreciate your help.
Avi.

On Mon, Dec 1, 2014 at 6:00 PM, Ander Juaristi Alamos <ajuaristi@gmx.es>
wrote:

> I've never used it, but it seems to be an implementation of HTTP
> Basic/Digest Authentication, defined in RFC 2617:
> http://www.ietf.org/rfc/rfc2617.txt. Someone please correct me if I'm
> wrong.
>
> If your crawler hits a site that requires user authentication, it won't be
> able to scrape anything but the entity body sent along with the 401
> response, which usually isn't very meaningful. You must know the
> user/password credentials of every site your crawler visits in order to get
> the actual content.
>
> If you want to give it a try, you can set up basic HTTP authentication
> with PHP. Here are a couple of links:
>
> http://stackoverflow.com/questions/4150507/how-can-i-use-basic-http-authentication-in-php
> http://php.net/manual/en/features.http-auth.php
>
> Regards,
>
> - AJ
>
>
>
> Sent: Tuesday, November 25, 2014 at 15:27
> From: "Avi Hayun" <avraham2@gmail.com>
> To: "HttpClient User Discussion" <httpclient-users@hc.apache.org>
> Subject: When should I use the ClientAuthentication ?
> I am maintaining a Web Crawler.
>
>
> I want to integrate crawling of sites that have username/password-protected zones.
>
>
> I successfully integrated FORM based authentication.
>
>
> I also want to integrate the ClientAuthentication example I can see here:
>
> https://hc.apache.org/httpcomponents-client-ga/httpclient/examples/org/apache/http/examples/client/ClientAuthentication.java
>
>
> But, in order to check it out I need a scenario - a site with a zone
> protected by this type of authentication.
>
>
> Can anybody supply me with an example site where I can use this
> ClientAuthentication in order to crawl?
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: httpclient-users-unsubscribe@hc.apache.org
> For additional commands, e-mail: httpclient-users-help@hc.apache.org
>
>
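
For the test scenario asked about in this thread, one alternative to the PHP setup is a throwaway local server. The sketch below is only an illustration and is not the ClientAuthentication example from HttpClient: it uses JDK classes alone (com.sun.net.httpserver.HttpServer and HttpURLConnection), and the class name BasicAuthDemo, the /private path, the realm "test" and the credentials are all made-up assumptions. It shows the RFC 2617 Basic exchange described above: a request without credentials gets a 401 with a WWW-Authenticate header, and a request carrying a matching Authorization header gets the page.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthDemo {

    // Builds the Authorization header value for HTTP Basic (RFC 2617):
    // "Basic " followed by base64("user:password").
    public static String basicHeader(String user, String password) {
        String pair = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    // Starts a local server on a free port. The hypothetical /private zone
    // answers 200 only when the expected credentials are sent, else 401.
    public static HttpServer startServer(String user, String password) throws IOException {
        String expected = basicHeader(user, password);
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/private", exchange -> {
            String sent = exchange.getRequestHeaders().getFirst("Authorization");
            if (expected.equals(sent)) {
                byte[] body = "secret page".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
            } else {
                // Challenge the client, as a Basic-protected site would.
                exchange.getResponseHeaders().add("WWW-Authenticate", "Basic realm=\"test\"");
                exchange.sendResponseHeaders(401, -1); // -1: no response body
            }
            exchange.close();
        });
        server.start();
        return server;
    }

    // Sends a GET and returns the status code; authHeader may be null.
    public static int fetchStatus(String url, String authHeader) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        if (authHeader != null) {
            conn.setRequestProperty("Authorization", authHeader);
        }
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startServer("crawler", "s3cret");
        String url = "http://127.0.0.1:" + server.getAddress().getPort() + "/private";
        try {
            System.out.println("without credentials: " + fetchStatus(url, null));
            System.out.println("with credentials:    "
                    + fetchStatus(url, basicHeader("crawler", "s3cret")));
        } finally {
            server.stop(0);
        }
    }
}
```

Run as written, the first request is expected to report 401 and the second 200. A crawler built on HttpClient could be pointed at the same local URL to exercise its own credential handling instead of the hand-built header used here.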
