manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Sharepoint Crawl - Missing documents
Date Mon, 04 Mar 2019 11:41:18 GMT
Hi Gaurav,
There is no document count threshold value.
If you can identify libraries or subsites that aren't being crawled, you
can turn on connector debugging to see why the connector is skipping them.
There could be many reasons for a library or site to be skipped, e.g. bad
specification rules, or permissions insufficient to read them.

Karl
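
One way to identify which libraries or subsites are affected, as the reply above suggests, is to compare a list of item URLs exported from SharePoint with the URLs that actually reached the index, and count the gaps per folder. The following is a minimal sketch and not part of ManifoldCF: the function name `missing_by_folder`, the example host `sp.example.com`, and the assumption that both URL lists are available as plain strings are all hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit
import posixpath


def missing_by_folder(expected_urls, crawled_urls):
    """Count expected-but-not-crawled URLs, grouped by parent folder path.

    Both arguments are iterables of URL strings (hypothetical exports:
    one from SharePoint, one from the search index). Comparison is
    case-insensitive because SharePoint URLs usually are.
    """
    crawled = {u.strip().lower() for u in crawled_urls if u.strip()}
    counts = Counter()
    for url in expected_urls:
        key = url.strip()
        if not key or key.lower() in crawled:
            continue  # blank line, or the item was crawled
        folder = posixpath.dirname(urlsplit(key).path)
        counts[folder] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    expected = [
        "http://sp.example.com/sites/a/Docs/one.docx",
        "http://sp.example.com/sites/a/Docs/two.docx",
        "http://sp.example.com/sites/a/Archive/three.docx",
    ]
    crawled = ["http://sp.example.com/sites/a/docs/one.docx"]
    for folder, n in missing_by_folder(expected, crawled).most_common():
        print(f"{n:6d} missing under {folder}")
```

Folders with the largest counts are reasonable first candidates to check against the job's specification rules and the crawl account's permissions, with connector debugging turned on.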


On Mon, Mar 4, 2019 at 4:03 AM Gaurav G <goyalgauravg@gmail.com> wrote:

> Hi,
>
> We are trying to crawl a SharePoint list with about 150,000 items and a
> library with about 125,000 documents.
> We have a separate job for each. The list job crawls only about 50,000
> items and completes cleanly, while the library job crawls about 40,000
> documents and also completes cleanly.
> We are trying to figure out why we are not getting the complete list. Is
> there a threshold value beyond which crawling stops?
> For smaller repositories (fewer than 30,000 items) we do not see this
> problem. Those get crawled completely.
>
> Thanks,
> Gaurav
>
>
