manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <>
Subject [jira] [Commented] (CONNECTORS-1562) Document removal Elastic
Date Thu, 06 Dec 2018 18:55:00 GMT


Karl Wright commented on CONNECTORS-1562:

Hi [~SteenTi], the only thing I have not been able to verify is whether the ES connector is
working properly.  What I'd like you to do is set up your sample job so that it is small
enough to crawl quickly, and use the Null output connector rather than the ES one.  Please
then make sure you know how to run the web crawl jobs and that you see the same things I saw
above.  Once you get to that point, we can verify whether or not ES is doing the right thing.

Thanks again.

> Document removal Elastic
> ------------------------
>                 Key: CONNECTORS-1562
>                 URL:
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Elastic Search connector, Web connector
>    Affects Versions: ManifoldCF 2.11
>         Environment: ManifoldCF 2.11
> Elasticsearch 6.3.2
> Web input connector
> Elastic output connector
> Job crawls website input and outputs content to Elastic
>            Reporter: Tim Steenbeke
>            Assignee: Karl Wright
>            Priority: Critical
>              Labels: starter
>         Attachments: Screenshot from 2018-12-05 09-01-46.png
>   Original Estimate: 4h
>  Remaining Estimate: 4h
> My documents aren't removed from the Elasticsearch index after rerunning the job with changed seeds.
> I update my job to change the seedmap and rerun it, or use the scheduler to keep it running
> even after updating it.
> After the rerun, the unreachable documents don't get deleted.
> It only adds documents when they can be reached.

This message was sent by Atlassian JIRA
