nutch-user mailing list archives

From Susam Pal <susam....@gmail.com>
Subject Re: Crawling authenticated websites !
Date Thu, 18 Mar 2010 16:25:31 GMT
On Thu, Mar 18, 2010 at 7:27 PM, Ranganath Cuddapah
<ranganath.c@gmail.com> wrote:
> Hello,
> Is there a way to configure Nutch to crawl "forms authenticated" websites?
> What I mean is the kind of websites which look up a database for
> authentication/authorization and does not allow you to view secure pages
> unless authenticated. This need not be specifically on https, but on http
> too..!
> Any help is greatly appreciated.
> Thanks,
> Ranganath
> P.S : Not sure if this is the right email to ask the question. Apologies, in
> advance.

nutch-user@lucene.apache.org is the right place to ask this. I've
included it in CC.

This feature is not present in Nutch. We have recorded a summary of
some old discussions about it here:
http://wiki.apache.org/nutch/HttpPostAuthentication
However, it was never implemented.

Regards,
Susam Pal
