manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CONNECTORS-1249) Keep a separate document priority queue per job, and synchronize with any running jobs on job start
Date Sat, 31 Oct 2015 05:35:27 GMT
Karl Wright created CONNECTORS-1249:
---------------------------------------

             Summary: Keep a separate document priority queue per job, and synchronize with any running jobs on job start
                 Key: CONNECTORS-1249
                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1249
             Project: ManifoldCF
          Issue Type: Improvement
          Components: Framework crawler agent
    Affects Versions: ManifoldCF 2.2
            Reporter: Karl Wright
            Assignee: Karl Wright
             Fix For: ManifoldCF 2.3


Starting a job while a long-running job is already active in MCF takes a very long time,
because the new job's documents don't get processed until the backlog that the other jobs
had at the time the new job was started has been worked off.

Effectively, this is because there is only one stream of document priorities, and all jobs
tap into that.  But there's no reason why we can't have multiple document priority streams,
one per active job, with some redesign work.
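
To make the difference concrete, here is a small toy model of the two approaches. It is
only an illustrative sketch and not ManifoldCF code: the class names SingleStreamQueue and
PerJobQueues are hypothetical, priorities are modeled as plain increasing counters, and
the round-robin choice between jobs is an assumption rather than the planned design. The
synchronization with running jobs on job start is not modeled.

```python
import heapq
from collections import deque


class SingleStreamQueue:
    """Toy model of one shared priority stream (lower value = processed sooner)."""

    def __init__(self):
        self._next_priority = 0   # single counter shared by all jobs
        self._heap = []

    def add(self, job, doc):
        heapq.heappush(self._heap, (self._next_priority, job, doc))
        self._next_priority += 1

    def pop(self):
        _, job, doc = heapq.heappop(self._heap)
        return job, doc


class PerJobQueues:
    """Toy model of one priority stream per job, served round-robin (an assumption)."""

    def __init__(self):
        self._next_priority = {}  # job -> that job's own counter
        self._heaps = {}          # job -> that job's own heap
        self._rotation = deque()  # jobs that may still have queued documents

    def add(self, job, doc):
        if job not in self._heaps:
            self._heaps[job] = []
            self._next_priority[job] = 0
            self._rotation.append(job)
        heapq.heappush(self._heaps[job], (self._next_priority[job], doc))
        self._next_priority[job] += 1

    def pop(self):
        while self._rotation:
            job = self._rotation.popleft()
            heap = self._heaps[job]
            if heap:
                _, doc = heapq.heappop(heap)
                self._rotation.append(job)  # job goes to the back of the rotation
                return job, doc
            # This job has nothing left; forget it until it queues documents again.
            del self._heaps[job]
            del self._next_priority[job]
        raise IndexError("no documents queued")


# Job A already has a backlog of five documents when job B starts with one.
shared, per_job = SingleStreamQueue(), PerJobQueues()
for queue in (shared, per_job):
    for i in range(5):
        queue.add("A", f"a{i}")
    queue.add("B", "b0")

shared_order = [shared.pop() for _ in range(6)]
per_job_order = [per_job.pop() for _ in range(6)]
print(shared_order.index(("B", "b0")))   # 5: B waits behind A's whole backlog
print(per_job_order.index(("B", "b0")))  # 1: B is served right after A's first document
```

In the shared-stream model the new job's first document sits behind everything queued
before it, which matches the delay described above. With one stream per job, the new job
gets a turn as soon as the rotation reaches it, regardless of how large the other
backlogs are.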




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
