Date: Tue, 4 Nov 2014 18:15:20 +0000
Subject: Re: (Continuous) crawl performance
From: Aeham Abushwashi
To: dev@manifoldcf.apache.org

Hi Karl,

After applying the 1.7.2 revisions for CONNECTORS-1090, -1091, -1092 and -1093 to my 1.6.1 branch, if I create a new crawl, then its documents get picked up by the next scan; however, that doesn't happen for an existing crawl. The docpriority for documents in the existing crawl is still at 1000000001.

I believe the priority should be set by ManifoldCF#resetAllDocumentPriorities but it's not, because JobManager#getNextNotYetProcessedReprioritizationDocuments returns no rows to update, which I think is due to the legacy job's docs having a priorityset of NULL.

Replacing the current priorityset condition in JobManager#getNextNotYetProcessedReprioritizationDocuments with (priorityset IS NULL OR priorityset