manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <>
Subject [jira] [Commented] (CONNECTORS-1562) Document removal Elastic
Date Mon, 10 Dec 2018 11:24:00 GMT


Karl Wright commented on CONNECTORS-1562:

[~SteenTi], good that the scheduler is working as expected.

> Next I edited the seeds and deleted some links and let the job run scheduled again.
> There were 0 Deletions and the Simple History also showed 0 deletion messages.

The scheduler doesn't have any impact on the way a job runs, unless you tell it to do a "minimal"
run rather than a "complete" one.  There's a pulldown for every schedule record you create
that lets you decide which it's going to be.  What is selected for your schedule record?
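
To make the difference concrete, here is a minimal conceptual sketch in Python. It is not ManifoldCF code: the function `run_job`, its parameters, and the example URLs are all hypothetical, and it assumes that a "minimal" run skips the cleanup of documents that are no longer reachable from the seeds, while a "complete" run performs it.

```python
# Conceptual sketch only; these names are hypothetical and are not ManifoldCF APIs.

def run_job(indexed, reachable, mode):
    """Return (documents_to_index, documents_to_delete) for one job run.

    indexed   -- set of URLs already in the output index from earlier runs
    reachable -- set of URLs reachable from the current seeds
    mode      -- "complete" or "minimal"
    """
    if mode not in ("complete", "minimal"):
        raise ValueError(f"unknown run mode: {mode!r}")

    # Simplification: both modes are modeled as indexing everything reachable.
    to_index = set(reachable)
    if mode == "complete":
        # Assumed behavior: only a complete run removes unreachable documents.
        to_delete = indexed - reachable
    else:
        to_delete = set()
    return to_index, to_delete


indexed = {"http://example.com/a", "http://example.com/b"}
reachable = {"http://example.com/a"}  # the seed leading to /b was removed

print(run_job(indexed, reachable, "complete")[1])
print(run_job(indexed, reachable, "minimal")[1])
```

Under these assumptions, a schedule record set to "minimal" would show 0 deletions even after seeds were removed, which is why the pulldown setting matters.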

Also, were you able to see deletions when you followed my steps above?

> Document removal Elastic
> ------------------------
>                 Key: CONNECTORS-1562
>                 URL:
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Elastic Search connector, Web connector
>    Affects Versions: ManifoldCF 2.11
>         Environment: ManifoldCF 2.11
> Elasticsearch 6.3.2
> Web input connector
> Elasticsearch output connector
> Job crawls website input and outputs content to Elasticsearch
>            Reporter: Tim Steenbeke
>            Assignee: Karl Wright
>            Priority: Critical
>              Labels: starter
>         Attachments: Screenshot from 2018-12-05 09-01-46.png
>   Original Estimate: 4h
>  Remaining Estimate: 4h
> My documents aren't removed from the Elasticsearch index after rerunning the job with changed seeds.
> I update my job to change the seedmap and rerun it, or use the scheduler to keep it running even after updating it.
> After the rerun, the unreachable documents don't get deleted.
> It only adds documents when they can be reached.
