nutch-user mailing list archives

From Claudio Martella <claudio.marte...@tis.bz.it>
Subject Re: default authentication scheme
Date Tue, 11 Jan 2011 23:14:34 GMT
Hi Susam,

thanks for your answer. This is the code:

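import java.util.ArrayList;
import java.util.List;
import org.apache.commons.httpclient.*;
import org.apache.commons.httpclient.auth.AuthPolicy;
import org.apache.commons.httpclient.auth.AuthScope;
import org.apache.commons.httpclient.methods.GetMethod;

HttpClient client = new HttpClient();
client.getParams().setAuthenticationPreemptive(false);
// NTCredentials extends UsernamePasswordCredentials, so it also works for Digest and Basic
Credentials defaultcreds = new NTCredentials("user", "password",
"client", "host");
// Preferred schemes, in order of priority
List<String> authPrefs = new ArrayList<String>();
authPrefs.add(AuthPolicy.DIGEST);
authPrefs.add(AuthPolicy.BASIC);
// This will exclude the NTLM authentication scheme
client.getParams().setParameter(AuthPolicy.AUTH_SCHEME_PRIORITY, authPrefs);
client.getState().setCredentials(AuthScope.ANY, defaultcreds);
HttpMethod method = new GetMethod("host");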

You basically play with priorities.
Yes, I checked the code this afternoon, and I think a solution that could
work in the current Nutch code would be to set the priority explicitly.
For example, if you set Digest as the default scheme, it will try that first.
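
To make the priority idea concrete, here is a minimal stand-alone sketch in plain Java. It is only a simplified model of the selection step, not HttpClient's actual implementation: the class and method names (SchemePriorityDemo, offeredSchemes, select) are made up for illustration, and the Digest challenge parameters are shortened from the ones in the logs.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical model: pick the first scheme from a priority list
// that also appears among the server's WWW-Authenticate challenges.
public class SchemePriorityDemo {

    /** Extracts the scheme name (first token) from each WWW-Authenticate value. */
    static Set<String> offeredSchemes(List<String> challenges) {
        Set<String> schemes = new LinkedHashSet<String>();
        for (String challenge : challenges) {
            String trimmed = challenge.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String name = trimmed.split("\\s+", 2)[0];
            schemes.add(name.toLowerCase(Locale.ROOT));
        }
        return schemes;
    }

    /** Returns the first scheme in the priority list that the server offered, or null. */
    static String select(List<String> priority, List<String> challenges) {
        Set<String> offered = offeredSchemes(challenges);
        for (String scheme : priority) {
            String normalized = scheme.toLowerCase(Locale.ROOT);
            if (offered.contains(normalized)) {
                return normalized;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Challenges as in the IIS response from the logs (Digest parameters shortened).
        List<String> challenges = Arrays.asList(
            "Negotiate",
            "NTLM",
            "Digest qop=\"auth\",algorithm=MD5-sess,realm=\"Digest\"");

        // Preference order reported in the logs: NTLM is picked.
        System.out.println(select(Arrays.asList("ntlm", "digest", "basic"), challenges));
        // Explicit priority list without NTLM: Digest is picked.
        System.out.println(select(Arrays.asList("digest", "basic"), challenges));
    }
}
```

With the order from the logs the model picks ntlm; with a list that leaves NTLM out it picks digest, which is the effect of setting AuthPolicy.AUTH_SCHEME_PRIORITY in the snippet above.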

What do you think?

On 1/11/11 4:49 PM, Susam Pal wrote:
> Hi Claudio,
>
> I worked on this a long time ago. As far as I remember, the Apache
> Jakarta Commons HttpClient library would attempt NTLM authentication
> if 'NTLM' is found in the 'WWW-Authenticate' header in the HTTP
> response. It would ignore 'Digest' in that case because NTLM
> authentication scheme is believed to be more secure than Digest
> authentication scheme. If you want to confirm this behaviour, you
> could try the Jakarta Commons HttpClient mailing list:
> http://hc.apache.org/httpclient-3.x/mail-lists.html
>
> Could you please share the code where you are able to force the usage
> of Digest authentication scheme?
>
> This is the source file where the code for authentication in Nutch is
> written: http://svn.apache.org/viewvc/nutch/trunk/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/Http.java?view=markup
>
> In lines 334 to 337 and 402 to 405, you can find that all credentials
> are set as NTCredentials objects. I don't have a solution for you yet.
> But I hope that this information would help you in some manner.
>
> Regards,
> Susam Pal
>
> On Tue, Jan 11, 2011 at 7:46 PM, Claudio Martella
> <claudio.martella@tis.bz.it> wrote:
>> Hi,
>>
>> I'm trying to authenticate with HTTP through DIGEST.
>>
>> I set the <default scheme="DIGEST"/> in my conf. The webserver supports
>> NTLMv2 and Digest. As NTLMv2 is not supported by httpclient, I'd like to
>> force it to use Digest.
>> The problem is that, as both NTLM and Digest are offered by the
>> webserver, Nutch still tries only NTLM.
>>
>> Here are the logs:
>> 2011-01-11 15:10:53,800 TRACE httpclient.Http - Credentials - username:
>> cm; set as default for realm: 192.168.10.210:8090; scheme: digest
>> [...]
>> 2011-01-11 14:40:27,781 DEBUG wire.header - << "Server:
>> Microsoft-IIS/6.0[\r][\n]"
>> 2011-01-11 14:40:27,781 DEBUG wire.header - << "WWW-Authenticate:
>> Negotiate[\r][\n]"
>> 2011-01-11 14:40:27,782 DEBUG wire.header - << "WWW-Authenticate:
>> NTLM[\r][\n]"
>> 2011-01-11 14:40:27,782 DEBUG wire.header - << "WWW-Authenticate: Digest
>> qop="auth",algorithm=MD5-sess,nonce="+Upgraded+v1df768737f223ab92f2de931c95b1cb01a23417361f3196f75258730e42c66de8880242a9455ccac90972955276d42451",charset=utf-8,realm="Digest"[\r][\n]"
>> [...]
>> 2011-01-11 15:10:53,885 DEBUG httpclient.HttpMethodDirector -
>> Authorization required
>> 2011-01-11 15:10:53,893 DEBUG auth.AuthChallengeProcessor - Supported
>> authentication schemes in the order of preference: [ntlm, digest, basic]
>> 2011-01-11 15:10:53,893 INFO  auth.AuthChallengeProcessor - ntlm
>> authentication scheme selected
>> 2011-01-11 15:10:53,893 DEBUG auth.AuthChallengeProcessor - Using
>> authentication scheme: ntlm
>> 2011-01-11 15:10:53,893 DEBUG auth.AuthChallengeProcessor -
>> Authorization challenge processed
>> 2011-01-11 15:10:53,893 DEBUG httpclient.HttpMethodDirector -
>> Authentication scope: NTLM <any realm>@192.168.10.210:8090
>> 2011-01-11 15:10:53,893 DEBUG httpclient.HttpMethodDirector -
>> Credentials required
>> 2011-01-11 15:10:53,893 DEBUG httpclient.HttpMethodDirector -
>> Credentials provider not available
>> 2011-01-11 15:10:53,893 INFO  httpclient.HttpMethodDirector - No
>> credentials available for NTLM <any realm>@192.168.10.210:8090
>>
>> By the way the documentation should be fixed. In order to see the
>> credentials set in the logs the log4j should be set to TRACE and not DEBUG.
>>
>> Why isn't the Digest scheme being tried at all? The credentials are set
>> and the server negotiates it. I wrote a sample httpclient application
>> that forces the usage of digest and it does authenticate.
>>
>> Any suggestion?
>>
>> Thanks
>>
>> Claudio
>>
>>
>> --
>> Claudio Martella
>> Digital Technologies
>> Unit Research & Development - Analyst
>>
>> TIS innovation park
>> Via Siemens 19 | Siemensstr. 19
>> 39100 Bolzano | 39100 Bozen
>> Tel. +39 0471 068 123
>> Fax  +39 0471 068 129
>> claudio.martella@tis.bz.it http://www.tis.bz.it
>>
>>


-- 
Claudio Martella
Digital Technologies
Unit Research & Development - Analyst

TIS innovation park
Via Siemens 19 | Siemensstr. 19
39100 Bolzano | 39100 Bozen
Tel. +39 0471 068 123
Fax  +39 0471 068 129
claudio.martella@tis.bz.it http://www.tis.bz.it



