From: jetnet
Date: Mon, 11 Jul 2016 20:40:25 +0200
Subject: Re: Session-based authentication
To: user@manifoldcf.apache.org
Reply-To: user@manifoldcf.apache.org
Mailing-List: contact user-help@manifoldcf.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: text/plain;
 charset=UTF-8
archived-at: Mon, 11 Jul 2016 18:40:39 -0000

Hi Karl,

it seems that setting the attribute ClientCookie.DOMAIN_ATTR is only
required when the cookie's domain and the webserver's address are NOT
equal. In my case they are the same: wikisite

The cookie spec you use for the http client,
setCookieSpec(CookieSpecs.STANDARD), seems to be the right one (it uses
RFC6265LaxSpec).
So, I have to debug the httpclient to find out why the cookies do not
match here.
I'll open a jira ticket for that anyway.

Thanks!

2016-07-07 15:17 GMT+02:00 Karl Wright :
> Can you create a ticket and submit a patch?
> There is already code in place that overrides HttpClient's default cookie
> policy. I'm surprised this case is not already covered.
>
> Thanks!
> Karl
>
>
> On Thu, Jul 7, 2016 at 8:49 AM, jetnet wrote:
>>
>> hi Karl,
>>
>> the problem was that the seeding URL used the short host name, not the
>> FQDN. So, the default cookie policy works with FQDNs only.
>> That's why the obtained cookies were never used for the further requests.
>> Changing the seeding URL to the "full host name" format solved the
>> problem.
>>
>> jeeeez, that was a weird one...
>>
>> How about adding the next line to the code?
>>
>> cookie.setAttribute(ClientCookie.DOMAIN_ATTR, "true");
>>
>> Thanks!
>> Konstantin
>>
>> 2016-07-07 13:24 GMT+02:00 Karl Wright :
>> > Hi Konstantin,
>> >
>> > The mock site that the test crawls and logs into is generated by
>> > MockSessionWebService.java, under
>> > connectors/webcrawler/connector/src/test/java/org/apache/manifoldcf/crawler/connectors/webcrawler/tests.
>> > It does almost precisely what your site is doing. The test itself is
>> > SessionTester.java. Your setup should be similar to how the test sets
>> > up the login sequence and protected content area.
>> > >> > Thanks, >> > Karl >> > >> > >> > On Thu, Jul 7, 2016 at 7:17 AM, Karl Wright wrote: >> >> >> >> Hi Konstantin, >> >> >> >> There is an advanced Web Connector integration test, which currently >> >> passes, that tests session login and cookie transmission. I'll look >> >> over >> >> the test to be sure it is complete, but if so you should really be >> >> looking >> >> at your login sequence and verifying that the cookie set takes place in >> >> a >> >> request that is part of the login sequence. >> >> >> >> Thanks, >> >> Karl >> >> >> >> >> >> On Thu, Jul 7, 2016 at 6:58 AM, jetnet wrote: >> >>> >> >>> Thanks for the hint regarding the httpclient logging! >> >>> So, it turned out, the cookies do NOT get added to the request: >> >>> >> >>> DEBUG 2016-07-07 12:49:26,015 (Worker thread '4') - WEB: Get method >> >>> for '/sitemap.xml' >> >>> DEBUG 2016-07-07 12:49:26,015 (Worker thread '4') - WEB: Adding 2 >> >>> cookies for '/sitemap.xml' >> >>> DEBUG 2016-07-07 12:49:26,015 (Worker thread '4') - WEB: Cookie >> >>> '[version: 0][name: PHPSESSID][value: >> >>> 8jegbs2dqb6r9oc3mb4pt0q777][domain: wikisite][path: /][expiry: null]' >> >>> added >> >>> DEBUG 2016-07-07 12:49:26,015 (Worker thread '4') - WEB: Cookie >> >>> '[version: 0][name: authtoken][value: >> >>> 920_636034784213249598_d2f40072be60b4de7bee72d74fc04400][domain: >> >>> wikisite][path: /][expiry: Thu Jul 14 10:53:41 CEST 2016]' added >> >>> DEBUG 2016-07-07 12:49:26,030 (Thread-1214) - CookieSpec selected: >> >>> standard >> >>> DEBUG 2016-07-07 12:49:26,093 (Thread-1214) - Auth cache not set in >> >>> the >> >>> context >> >>> DEBUG 2016-07-07 12:49:26,093 (Thread-1214) - Connection request: >> >>> [route: {}->http://wikisite:80][total kept alive: 0; route allocated: >> >>> 0 of 1; total allocated: 0 of 20] >> >>> DEBUG 2016-07-07 12:49:26,140 (Thread-1214) - Connection leased: [id: >> >>> 0][route: {}->http://wikisite:80][total kept alive: 0; route >> >>> allocated: 1 of 1; total allocated: 1 of 
20] >> >>> DEBUG 2016-07-07 12:49:26,140 (Thread-1214) - Opening connection >> >>> {}->http://wikisite:80 >> >>> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - Connecting to >> >>> wikisite/10.0.0.100:80 >> >>> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - Connection established >> >>> 10.0.0.184:58501<->10.0.0.100:80 >> >>> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - http-outgoing-0: set >> >>> socket timeout to 300000 >> >>> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - Executing request GET >> >>> /sitemap.xml HTTP/1.1 >> >>> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - Target auth state: >> >>> UNCHALLENGED >> >>> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - Proxy auth state: >> >>> UNCHALLENGED >> >>> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - http-outgoing-0 >> GET >> >>> /sitemap.xml HTTP/1.1 >> >>> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - http-outgoing-0 >> >> >>> User-Agent: Mozilla/5.0 (ApacheManifoldCFWebCrawler; >> >>> email@wikisite.com) >> >>> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - http-outgoing-0 >> From: >> >>> email@wikisite.com >> >>> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - http-outgoing-0 >> >> >>> Accept: >> >>> */* >> >>> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - http-outgoing-0 >> >> >>> Accept-Encoding: gzip,deflate >> >>> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - http-outgoing-0 >> Host: >> >>> wikisite >> >>> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - http-outgoing-0 >> >> >>> Connection: Keep-Alive >> >>> DEBUG 2016-07-07 12:49:37,768 (Thread-1214) - http-outgoing-0 << >> >>> HTTP/1.1 >> >>> 200 OK >> >>> DEBUG 2016-07-07 12:49:37,768 (Thread-1214) - http-outgoing-0 << >> >>> Content-Type: application/xml; charset=utf-8 >> >>> DEBUG 2016-07-07 12:49:37,768 (Thread-1214) - http-outgoing-0 << >> >>> Server: Microsoft-IIS/7.5 >> >>> DEBUG 2016-07-07 12:49:37,768 (Thread-1214) - http-outgoing-0 << >> >>> X-Powered-By: PHP/5.2.14 >> >>> DEBUG 2016-07-07 12:49:37,768 (Thread-1214) - http-outgoing-0 << >> >>> 
Set-Cookie: PHPSESSID=bk9487elppchvshc38c7pfnv01; path=/; HttpOnly >> >>> DEBUG 2016-07-07 12:49:37,768 (Thread-1214) - http-outgoing-0 << >> >>> X-Powered-By: ASP.NET >> >>> DEBUG 2016-07-07 12:49:37,768 (Thread-1214) - http-outgoing-0 << Date: >> >>> Thu, 07 Jul 2016 10:49:38 GMT >> >>> DEBUG 2016-07-07 12:49:37,768 (Thread-1214) - http-outgoing-0 << >> >>> Content-Length: 684207 >> >>> >> >>> >> >>> Jira tiket? :) >> >>> >> >>> Thanks, >> >>> Konstantin >> >>> >> >>> >> >>> 2016-07-07 12:37 GMT+02:00 Karl Wright : >> >>> > It really does add cookies as stated. >> >>> > >> >>> > That doesn't mean, however, that the cookies being sent correspond >> >>> > to a >> >>> > session that is correctly logged in. There's no way to tell this >> >>> > from >> >>> > the >> >>> > logs. >> >>> > >> >>> > You can possibly get more information about the back-and-forth by >> >>> > enabling >> >>> > httpcomponents/httpclient wire logging. Headers only should be >> >>> > sufficient. >> >>> > You should see the exact cookies and be able to verify that the >> >>> > cookies >> >>> > sent >> >>> > are the ones that were returned. You still won't be able to tell if >> >>> > the >> >>> > login was successful or not. >> >>> > >> >>> > Karl >> >>> > >> >>> > >> >>> > >> >>> > On Thu, Jul 7, 2016 at 6:25 AM, jetnet wrote: >> >>> >> >> >>> >> ok, so, it means, that I do not need the 3rd stage at all? 
As the >> >>> >> second stage (form authentication) records the cookies and >> >>> >> redirects >> >>> >> back: >> >>> >> >> >>> >> the second stage: >> >>> >> >> >>> >> DEBUG 2016-07-07 10:52:48,231 (Worker thread '79') - WEB: Post >> >>> >> method >> >>> >> for '/Special:UserLogin' >> >>> >> DEBUG 2016-07-07 10:52:48,231 (Worker thread '79') - WEB: Post >> >>> >> parameter name 'username' value 'someuser' for '/Special:UserLogin' >> >>> >> DEBUG 2016-07-07 10:52:48,231 (Worker thread '79') - WEB: Post >> >>> >> parameter name 'returntourl' value 'http://wikisite/sitemap.xml' >> >>> >> for >> >>> >> '/Special:UserLogin' >> >>> >> DEBUG 2016-07-07 10:52:48,231 (Worker thread '79') - WEB: Post >> >>> >> parameter name 'password' value 'XXXXXX' for '/Special:UserLogin' >> >>> >> DEBUG 2016-07-07 10:52:48,231 (Worker thread '79') - WEB: Adding 2 >> >>> >> cookies for '/Special:UserLogin' >> >>> >> DEBUG 2016-07-07 10:52:48,231 (Worker thread '79') - WEB: Cookie >> >>> >> '[version: 0][name: PHPSESSID][value: >> >>> >> bughgf8fbjkkevk79ot4ef2vj1][domain: wikisite][path: /][expiry: >> >>> >> null]' >> >>> >> added >> >>> >> DEBUG 2016-07-07 10:52:48,231 (Worker thread '79') - WEB: Cookie >> >>> >> '[version: 0][name: authtoken][value: >> >>> >> 920_636034352097041592_136c71f2ac1fc2dd1ba72de805fcd1b5][domain: >> >>> >> wikisite][path: /][expiry: Wed Jul 13 22:53:29 CEST 2016]' added >> >>> >> DEBUG 2016-07-07 10:52:48,434 (Worker thread '79') - WEB: >> >>> >> Retrieving >> >>> >> cookies... 
>> >>> >> DEBUG 2016-07-07 10:52:48,434 (Worker thread '79') - WEB: Cookie >> >>> >> '[version: 0][name: PHPSESSID][value: >> >>> >> 589h3f20tjndhkc391nu5u0u51][domain: wikisite][path: /][expiry: >> >>> >> null]' >> >>> >> DEBUG 2016-07-07 10:52:48,434 (Worker thread '79') - WEB: Cookie >> >>> >> '[version: 0][name: authtoken][value: >> >>> >> 920_636034783686256706_585415102d050458acfd91a9d1f223d5][domain: >> >>> >> wikisite][path: /][expiry: Thu Jul 14 10:52:48 CEST 2016]' >> >>> >> INFO 2016-07-07 10:52:48,449 (Worker thread '79') - WEB: FETCH >> >>> >> LOGIN|http://wikisite/Special:UserLogin|1467881568231+218|302|153| >> >>> >> DEBUG 2016-07-07 10:52:48,449 (Worker thread '79') - WEB: Document >> >>> >> 'http://wikisite/Special:UserLogin' did not match expected form, >> >>> >> link, >> >>> >> redirection, or content for sequence 'wikisite' >> >>> >> >> >>> >> so, the last message means, nothing matches in the sequence anymore >> >>> >> - >> >>> >> logon end. >> >>> >> And the last two cookies are being used for the next fetch of the >> >>> >> sitemap, but the its content still matches the public pattern. >> >>> >> >> >>> >> Strange things happen... I just tried to use the authtoken cookie >> >>> >> from >> >>> >> the log direct in the browser - and it gets authenticated without >> >>> >> problems: I get the "private" content. But the manifoldcf not... >> >>> >> weird... 
>> >>> >> >> >>> >> DEBUG 2016-07-07 10:52:48,543 (Worker thread '79') - WEB: Adding 2 >> >>> >> cookies for '/sitemap.xml' >> >>> >> DEBUG 2016-07-07 10:52:48,543 (Worker thread '79') - WEB: Cookie >> >>> >> '[version: 0][name: PHPSESSID][value: >> >>> >> 589h3f20tjndhkc391nu5u0u51][domain: wikisite][path: /][expiry: >> >>> >> null]' >> >>> >> added >> >>> >> DEBUG 2016-07-07 10:52:48,543 (Worker thread '79') - WEB: Cookie >> >>> >> '[version: 0][name: authtoken][value: >> >>> >> 920_636034783686256706_585415102d050458acfd91a9d1f223d5][domain: >> >>> >> wikisite][path: /][expiry: Thu Jul 14 10:52:48 CEST 2016]' added >> >>> >> INFO 2016-07-07 10:52:58,500 (Worker thread '79') - WEB: FETCH >> >>> >> URL|http://wikisite/sitemap.xml|1467881568543+9957|200|684072| >> >>> >> >> >>> >> size: 684072 - is public content. >> >>> >> >> >>> >> Does it **really** add the cookies to the request? :) >> >>> >> >> >>> >> Thanks! >> >>> >> Konstantin >> >>> >> >> >>> >> 2016-07-07 11:44 GMT+02:00 Karl Wright : >> >>> >> > "I thought, when the auth sequence is done >> >>> >> > (exit login mode), the redirect to the original page happens >> >>> >> > automatically (which is the case here, but somehow the content is >> >>> >> > still "public")." >> >>> >> > >> >>> >> > That is correct BUT if the final redirection is what sets the >> >>> >> > cookies >> >>> >> > THEN >> >>> >> > the cookies will only be recorded by the web connector if the >> >>> >> > final >> >>> >> > redirection is part of the login sequence. >> >>> >> > >> >>> >> > Thanks, >> >>> >> > Karl >> >>> >> > >> >>> >> > >> >>> >> > On Thu, Jul 7, 2016 at 5:33 AM, jetnet wrote: >> >>> >> >> >> >>> >> >> hi Karl, >> >>> >> >> thank you for the very prompt feedback! >> >>> >> >> >> >>> >> >> > 1) Have you made sure to include the redirection back to the >> >>> >> >> > content? >> >>> >> >> This is the step I don't quite understand - could you please >> >>> >> >> clarify >> >>> >> >> how that could be done? 
>> >>> >> >> I thought that when the auth sequence is done (exit login mode),
>> >>> >> >> the redirect to the original page happens automatically (which
>> >>> >> >> is the case here, but somehow the content is still "public").
>> >>> >> >>
>> >>> >> >> > 2) your check for *entering* the login sequence is too broad
>> >>> >> >> > and fires again even though the private sitemap page is being
>> >>> >> >> > returned.
>> >>> >> >> totally agree, that's why the first step is to look into the
>> >>> >> >> content of the page, to check if there is a pattern which
>> >>> >> >> appears in the public version ONLY.
>> >>> >> >> This is the only solution I can imagine so far, but any ideas
>> >>> >> >> are very welcome!
>> >>> >> >>
>> >>> >> >> The simple history shows basically the same: the process never
>> >>> >> >> leaves the login stage.
>> >>> >> >>
>> >>> >> >> If I remove the 3rd step, then I see that the login stage is
>> >>> >> >> over (logon end), but as the content of the sitemap.xml is still
>> >>> >> >> "public", the login process kicks in again.
>> >>> >> >>
>> >>> >> >> Thanks!
>> >>> >> >> Konstantin
>> >>> >> >>
>> >>> >> >> 2016-07-07 11:07 GMT+02:00 Karl Wright :
>> >>> >> >> > Hi Konstantin,
>> >>> >> >> >
>> >>> >> >> > There are two possibilities:
>> >>> >> >> >
>> >>> >> >> > (1) You have missed one stage when specifying the login
>> >>> >> >> > sequence. The cookies are getting set, but not during a step
>> >>> >> >> > that's part of the login sequence. Have you made sure to
>> >>> >> >> > include the redirection back to the content?
>> >>> >> >> > (2) You really are logging in but your check for *entering*
>> >>> >> >> > the login sequence is too broad and fires again even though
>> >>> >> >> > the private sitemap page is being returned.
>> >>> >> >> >
>> >>> >> >> > You can also look at the simple history to get an idea of what
>> >>> >> >> > MCF is doing for your job for session handling.
>> >>> >> >> >
>> >>> >> >> > Thanks,
>> >>> >> >> > Karl
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > On Thu, Jul 7, 2016 at 4:35 AM, jetnet wrote:
>> >>> >> >> >>
>> >>> >> >> >> Hi All,
>> >>> >> >> >>
>> >>> >> >> >> I've been trying to set up a session-based auth sequence for a
>> >>> >> >> >> forked MediaWiki site (the Wiki connector does not work with
>> >>> >> >> >> this version), but somehow got stuck with the configuration.
>> >>> >> >> >> The idea is to index the site using its sitemap.xml with
>> >>> >> >> >> hops=1. The "public" version (user not logged in) of the
>> >>> >> >> >> sitemap.xml contains a different set of links than the
>> >>> >> >> >> "authenticated" one (user logged in).
>> >>> >> >> >> The current auth sequence looks like this (the job's seeding
>> >>> >> >> >> URL=http://wikisite/sitemap.xml):
>> >>> >> >> >>
>> >>> >> >> >> 1) the first call to the seeding URL should be redirected to
>> >>> >> >> >> the login page
>> >>> >> >> >> Login URL regexp: sitemap.xml
>> >>> >> >> >> Page type: content
>> >>> >> >> >> Identification regular expression: "public" version>
>> >>> >> >> >> Override target URL: /Special:UserLogin
>> >>> >> >> >>
>> >>> >> >> >> 2) enter user's credentials on the login page
>> >>> >> >> >> Login URL regexp: Special:UserLogin
>> >>> >> >> >> Page type: form
>> >>> >> >> >> Override form parameters: username=someuser, password=******,
>> >>> >> >> >> returntourl=http://wikisite/sitemap.xml
>> >>> >> >> >>
>> >>> >> >> >> 3) the login page ***should*** redirect back to the seeding
>> >>> >> >> >> URL with the authorized content
>> >>> >> >> >> Login URL regexp: /Special:UserLogin
>> >>> >> >> >> Page type: redirection
>> >>> >> >> >> Identification regular expression: /sitemap.xml
>> >>> >> >> >>
>> >>> >> >> >> From the log file I can see that the first 2 steps work fine:
>> >>> >> >> >> the public content gets recognized, the form data gets sent,
>> >>> >> >> >> the session's cookies get set. But the 3rd step returns the
>> >>> >> >> >> "public" version of the sitemap.xml again, and the login
>> >>> >> >> >> process gets stuck in a loop.
>> >>> >> >> >> Am I on the right track or did I miss something?
>> >>> >> >> >> >> >>> >> >> >> here is the log for the 3rd step: >> >>> >> >> >> >> >>> >> >> >> INFO 2016-07-06 22:52:27,285 (Worker thread '43') - WEB: >> >>> >> >> >> FETCH >> >>> >> >> >> >> >>> >> >> >> >> >>> >> >> >> LOGIN|http://wikisite/Special:UserLogin|1467838347082+203|302|153| >> >>> >> >> >> DEBUG 2016-07-06 22:52:27,285 (Worker thread '43') - WEB: >> >>> >> >> >> Tried >> >>> >> >> >> to >> >>> >> >> >> match raw url 'http://wikisite/sitemap.xml' >> >>> >> >> >> DEBUG 2016-07-06 22:52:27,285 (Worker thread '43') - WEB: >> >>> >> >> >> Tried >> >>> >> >> >> to >> >>> >> >> >> match cooked url 'http://wikisite/sitemap.xml' >> >>> >> >> >> DEBUG 2016-07-06 22:52:27,285 (Worker thread '43') - WEB: >> >>> >> >> >> Redirection >> >>> >> >> >> link lookup matched 'http://wikisite/sitemap.xml' >> >>> >> >> >> DEBUG 2016-07-06 22:52:27,285 (Worker thread '43') - WEB: >> >>> >> >> >> Document >> >>> >> >> >> 'http://wikisite/Special:UserLogin' matches preferred >> >>> >> >> >> redirection, >> >>> >> >> >> so >> >>> >> >> >> determined to be login page for sequence 'wikisite' >> >>> >> >> >> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: >> >>> >> >> >> Waiting >> >>> >> >> >> for >> >>> >> >> >> an HttpClient object >> >>> >> >> >> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: For >> >>> >> >> >> http://wikisite/sitemap.xml, setting virtual host to wikisite >> >>> >> >> >> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: Got >> >>> >> >> >> an >> >>> >> >> >> HttpClient object after 0 ms. 
>> >>> >> >> >> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: Get >> >>> >> >> >> method >> >>> >> >> >> for '/sitemap.xml' >> >>> >> >> >> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: >> >>> >> >> >> Adding >> >>> >> >> >> 2 >> >>> >> >> >> cookies for '/sitemap.xml' >> >>> >> >> >> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: >> >>> >> >> >> Cookie >> >>> >> >> >> '[version: 0][name: PHPSESSID][value: >> >>> >> >> >> 1vnhgi0f84dc9pi6eaoj0nau45][domain: wikisite][path: >> >>> >> >> >> /][expiry: >> >>> >> >> >> null]' >> >>> >> >> >> added >> >>> >> >> >> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: >> >>> >> >> >> Cookie >> >>> >> >> >> '[version: 0][name: authtoken][value: >> >>> >> >> >> >> >>> >> >> >> 920_636034351472613318_616a5fd45ce4d5fed6c5318d73b38070][domain: >> >>> >> >> >> wikisite][path: /][expiry: Wed Jul 13 22:52:27 CEST 2016]' >> >>> >> >> >> added >> >>> >> >> >> DEBUG 2016-07-06 22:52:35,660 (Worker thread '43') - WEB: >> >>> >> >> >> Retrieving >> >>> >> >> >> cookies... 
>> >>> >> >> >> DEBUG 2016-07-06 22:52:35,660 (Worker thread '43') - WEB: >> >>> >> >> >> Cookie >> >>> >> >> >> '[version: 0][name: PHPSESSID][value: >> >>> >> >> >> vqfpr88pqa6d62nl6h4lp03nu1][domain: wikisite][path: >> >>> >> >> >> /][expiry: >> >>> >> >> >> null]' >> >>> >> >> >> DEBUG 2016-07-06 22:52:35,660 (Worker thread '43') - WEB: >> >>> >> >> >> Cookie >> >>> >> >> >> '[version: 0][name: authtoken][value: >> >>> >> >> >> >> >>> >> >> >> 920_636034351472613318_616a5fd45ce4d5fed6c5318d73b38070][domain: >> >>> >> >> >> wikisite][path: /][expiry: Wed Jul 13 22:52:27 CEST 2016]' >> >>> >> >> >> INFO 2016-07-06 22:52:37,004 (Worker thread '43') - WEB: >> >>> >> >> >> FETCH >> >>> >> >> >> >> >>> >> >> >> LOGIN|http://wikisite/sitemap.xml|1467838347394+9610|200|683773| >> >>> >> >> >> DEBUG 2016-07-06 22:52:37,004 (Worker thread '43') - WEB: >> >>> >> >> >> Document >> >>> >> >> >> 'http://wikisite/sitemap.xml' is text, with encoding 'utf-8'; >> >>> >> >> >> link >> >>> >> >> >> extraction starting >> >>> >> >> >> DEBUG 2016-07-06 22:52:37,019 (Worker thread '43') - WEB: >> >>> >> >> >> Document >> >>> >> >> >> 'http://wikisite/sitemap.xml' matches content, so determined >> >>> >> >> >> to >> >>> >> >> >> be >> >>> >> >> >> login page for sequence 'wikisite' >> >>> >> >> >> >> >>> >> >> >> >> >>> >> >> >> Thank you! >> >>> >> >> >> regards, Konstantin >> >>> >> >> > >> >>> >> >> > >> >>> >> > >> >>> >> > >> >>> > >> >>> > >> >> >> >> >> > > >
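For reference, the domain-matching rule this thread keeps coming back to can be sketched in a few lines of plain Java. This is only my reading of RFC 6265 section 5.1.3, written against the JDK alone. It is not HttpClient's RFC6265LaxSpec code and not anything from the ManifoldCF web connector; the class and method names (CookieDomainMatch, domainMatches, looksLikeIpv4) are made up for illustration. Under this reading a cookie with domain "wikisite" should match the host "wikisite", which is why the behaviour seen in the logs still needs debugging in the client itself.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of ManifoldCF or HttpClient.
final class CookieDomainMatch {

    // RFC 6265, section 5.1.3 (as I read it): a host domain-matches a cookie
    // domain when both are identical, or when the host ends with "." + domain
    // and the host is a name rather than an IP address.
    static boolean domainMatches(String host, String cookieDomain) {
        String h = host.toLowerCase(Locale.ROOT);
        String d = cookieDomain.toLowerCase(Locale.ROOT);
        if (d.startsWith(".")) {
            d = d.substring(1); // a leading dot in the Domain attribute is ignored
        }
        if (d.isEmpty()) {
            return false;
        }
        if (h.equals(d)) {
            return true; // identical strings always match, FQDN or not
        }
        return h.endsWith("." + d) && !looksLikeIpv4(h);
    }

    // Simplification: only dotted IPv4 literals are recognised; IPv6 is ignored.
    static boolean looksLikeIpv4(String host) {
        return host.matches("\\d{1,3}(\\.\\d{1,3}){3}");
    }

    public static void main(String[] args) {
        // Short host name and identical cookie domain, as in the logs above.
        System.out.println(domainMatches("wikisite", "wikisite"));
        // Host inside a parent domain (example.com is a placeholder).
        System.out.println(domainMatches("wikisite.example.com", "example.com"));
        // Short host name against an FQDN cookie domain.
        System.out.println(domainMatches("wikisite", "wikisite.example.com"));
    }
}
```

If the real client rejects the first case, the cause is presumably somewhere other than the RFC 6265 matching rule itself, for example in how the cookie's domain attribute is populated before the request is built. That is a guess and would have to be confirmed in a debugger.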