Date: Mon, 15 Dec 2014 12:22:13 +0000 (UTC)
From: "Karl Wright (JIRA)"
To: dev@manifoldcf.apache.org
Reply-To: dev@manifoldcf.apache.org
Subject: [jira] [Created] (CONNECTORS-1122) Explore ways to make job start be faster in systems with lots of documents

Karl Wright created CONNECTORS-1122:
---------------------------------------

             Summary: Explore ways to make job start be faster in systems with lots of documents
                 Key: CONNECTORS-1122
                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1122
             Project: ManifoldCF
          Issue Type: Improvement
          Components: Framework crawler agent
    Affects Versions: ManifoldCF 1.8, ManifoldCF 2.0
            Reporter: Karl Wright
            Assignee: Karl Wright
             Fix For: ManifoldCF 1.9, ManifoldCF 2.1

Job start requires all documents to be marked as needing reprioritization now. We should consider ways in which we can reduce the need to do this as much as possible.
For example, if there are NO documents at all for a job, reprioritization is by definition unneeded. Alternatively, if we could come up with a way of determining whether there are any bin-level overlaps between documents made active by a job start and documents elsewhere, we could be more targeted.
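
The two ideas in the description could be sketched roughly as follows. This is only an illustration, not ManifoldCF code: the function name `bins_to_reprioritize` and the representation of bins as plain strings are hypothetical, and it assumes that documents in bins not shared with the starting job would not need their priorities recomputed, which the issue suggests but does not confirm.

```python
from typing import Iterable, Set


def bins_to_reprioritize(job_bins: Iterable[str],
                         other_bins: Iterable[str]) -> Set[str]:
    """Return the bins whose documents might need reprioritization.

    job_bins:   bin names of documents the starting job would make active
                (hypothetical input; how bins are looked up is not shown).
    other_bins: bin names of documents belonging to other jobs.

    An empty result means reprioritization could be skipped entirely.
    """
    job_set = set(job_bins)
    if not job_set:
        # The job has no documents at all, so there is nothing to do.
        return set()
    # Only bins shared with documents elsewhere are candidates.
    return job_set & set(other_bins)


if __name__ == "__main__":
    print(bins_to_reprioritize([], ["a.example.com"]))
    print(bins_to_reprioritize(["a.example.com", "c.example.com"],
                               ["c.example.com", "d.example.com"]))
```

Returning the set of overlapping bins, rather than a yes/no answer, is one way to be "more targeted": a caller could mark only documents in those bins instead of every document.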