manifoldcf-user mailing list archives

From Kambiz Niktabar <nikta...@yahoo.com>
Subject Re: Wiki connector stuck crawling namespaces other than default
Date Wed, 01 Oct 2014 13:50:21 GMT
Hi Karl,

A snapshot of the job view page is attached. By the way, that namespace seems to contain
only 27 pages, and they are still not being processed after several minutes (see the
second snapshot).

Regards
Kambiz


________________________________
 From: Karl Wright <daddywri@gmail.com>
To: "user@manifoldcf.apache.org" <user@manifoldcf.apache.org>; Kambiz Niktabar <niktabar@yahoo.com>

Sent: Wednesday, October 1, 2014 2:05 PM
Subject: Re: Wiki connector stuck crawling namespaces other than default
 


Hi Kambiz,

The debugging output indicates that your namespace name is "404".  That doesn't sound correct
to me.

>>>>>>
GET /wiki/api.php?format=xml&action=query&list=allpages&apnamespace=404&apfrom=Africa%3ATetianCarbonates&aplimit=500
HTTP/1.1
<<<<<<

I went back through the code and can find no way that the namespace could be corrupted.
But maybe this is actually correct.  Can you send along a screenshot of the view page for
the job?
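
For anyone reading the same debug output, the request line quoted above can be decoded to see exactly which namespace and starting title were sent. This is only a small sketch using the Python standard library; the helper name allpages_params is made up for illustration and is not part of ManifoldCF.

```python
from urllib.parse import urlsplit, parse_qs


def allpages_params(request_line):
    """Return the decoded query parameters of a logged 'GET <path> HTTP/1.1' line.

    Hypothetical helper for reading the debug log; not part of ManifoldCF.
    """
    # The request line has three space-separated parts: method, path, protocol.
    path = request_line.split()[1]
    query = urlsplit(path).query
    # parse_qs returns a list per key and percent-decodes values; keep the first.
    return {key: values[0] for key, values in parse_qs(query).items()}


line = ("GET /wiki/api.php?format=xml&action=query&list=allpages"
        "&apnamespace=404&apfrom=Africa%3ATetianCarbonates&aplimit=500 HTTP/1.1")
params = allpages_params(line)
print(params["apnamespace"], params["apfrom"], params["aplimit"])
```

Running this over each GET line in the log makes it easy to compare the apnamespace value against the job's configuration and to watch apfrom change between requests.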


Also, the wiki connector seeds documents in batches of 500.  It uses the last title
fetched to find the next batch of 500.  So if there are a lot of documents,
it will take a while to seed them all.  In your log I see signs that this is what is happening.
Have a look at all the GET requests and note the apfrom parameter.
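
The batching described above can be pictured roughly as follows. This is a simplified sketch, not the connector's actual code: fetch_batch is a hypothetical in-memory stand-in for one allpages request, and it assumes titles are unique, sorted, and that apfrom is inclusive (so the last title of one batch reappears at the start of the next and is skipped).

```python
from bisect import bisect_left


def fetch_batch(all_titles, apfrom, limit):
    """Hypothetical stand-in for one allpages request.

    Returns up to `limit` titles from the sorted list, starting at `apfrom`
    (inclusive), or from the beginning when `apfrom` is None.
    """
    start = 0 if apfrom is None else bisect_left(all_titles, apfrom)
    return all_titles[start:start + limit]


def seed_titles(all_titles, limit=500):
    """Yield every title, one batch at a time, continuing from the last title seen."""
    if limit < 2:
        # With an inclusive apfrom, a limit of 1 would only ever return the same title.
        raise ValueError("limit must be at least 2")
    apfrom = None
    while True:
        batch = fetch_batch(all_titles, apfrom, limit)
        # Skip the title that was already yielded at the end of the previous batch.
        if apfrom is not None and batch and batch[0] == apfrom:
            batch = batch[1:]
        if not batch:
            break
        yield from batch
        apfrom = batch[-1]  # the next request starts from the last title fetched


titles = sorted(f"Page{i:04d}" for i in range(1200))
seeded = list(seed_titles(titles))
print(len(seeded))
```

Each pass through the loop corresponds to one GET request in the log, so a steadily advancing apfrom value means seeding is still making progress.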





Thanks,
Karl