Date: Fri, 8 Nov 2013 12:46:01 -0800
Subject: Web crawl not completing
From: Mark Libucha
To: user@manifoldcf.apache.org

My web crawl does not complete. The UI gets stuck at the end, showing that 4 documents are still Active.

In my logs, 4 URIs show warnings like this:

    WARN 2013-11-08 15:09:04,439 (Worker thread '48') - Pre-ingest service interruption reported for job 1383941193567 connection 'web': Timed out waiting for response for 'http://myhost/somefile': The target server failed to respond

It is not always 4 files (sometimes more, sometimes fewer), and not always the same files.

My output connector never gets the job-completed callback.

Any suggestions?

Thanks,

Mark
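
To check whether the same URIs keep timing out from run to run, the warnings can be tallied with a short script. This is only a minimal sketch: it assumes every timeout warning follows the format of the single WARN line quoted in this message, and the names count_timeouts and sample are illustrative, not part of ManifoldCF.

```python
import re
from collections import Counter

# Pattern assumed from the one WARN line quoted in this message; other
# ManifoldCF versions or connectors may word the warning differently.
TIMEOUT_RE = re.compile(
    r"WARN (?P<ts>\S+ \S+) \(Worker thread '(?P<thread>\d+)'\) - "
    r"Pre-ingest service interruption reported for job (?P<job>\d+) "
    r"connection '(?P<conn>[^']*)': "
    r"Timed out waiting for response for '(?P<url>[^']*)'"
)


def count_timeouts(lines):
    """Return a Counter mapping each timed-out URL to its number of warnings."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group("url")] += 1
    return counts


# The sample is the warning from this message, split only for readability.
sample = [
    " WARN 2013-11-08 15:09:04,439 (Worker thread '48') - Pre-ingest service "
    "interruption reported for job 1383941193567 connection 'web': Timed out "
    "waiting for response for 'http://myhost/somefile': The target server "
    "failed to respond",
]

# List the most frequently timed-out URLs first.
for url, n in count_timeouts(sample).most_common():
    print(n, url)
```

For a real log, pass an open file object instead of the sample list, for example count_timeouts(open(path)) with path pointing at wherever the installation writes its log. URLs that appear in every run point at the target server; URLs that change each time suggest load or throttling instead.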