manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Needed information on multi-process-zookeeper setup
Date Wed, 28 Mar 2018 13:05:48 GMT
Hi Vinay,

The agents process is the one that does the crawling.
Tomcat serves the UI, API, and Authority Service web applications.

MCF requires its connectors to be bounded in memory consumption.  That
means that once you determine how much memory you need, the consumption
will not depend on things like the size of the documents being crawled.
Adding more memory beyond that point, therefore, is not going to help you.
The amount of memory you need does, however, scale with the number of
worker threads.
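
As a rough illustration of that scaling, here is a minimal sketch in
Python. The function name and the per-thread and base figures are
hypothetical placeholders, not measured ManifoldCF values; the real
numbers have to come from observing your own agents process. The worker
thread count itself is, as far as I know, set through the
org.apache.manifoldcf.crawler.threads property in properties.xml.

```python
def estimate_agents_heap_mb(worker_threads, per_thread_mb=10, base_mb=256):
    """Rough heap estimate for the agents process, in megabytes.

    Both per_thread_mb and base_mb are hypothetical placeholders.
    Document size is deliberately absent: with memory-bounded
    connectors, only the worker thread count moves the estimate.
    """
    if worker_threads < 0:
        raise ValueError("worker_threads must be non-negative")
    return base_mb + worker_threads * per_thread_mb


# Compare a few candidate worker thread counts.
for threads in (10, 30, 50):
    print(threads, "threads ->", estimate_agents_heap_mb(threads), "MB")
```

Once the heap covers the chosen thread count, raising it further buys
nothing; to push more documents through, raise the thread count and then
size the heap to match.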

Karl


On Wed, Mar 28, 2018 at 8:57 AM, VINAY Bengaluru <vinaybs.20@gmail.com>
wrote:

> Hi,
>       Currently I have a zookeeper setup along with two tomcat instances
> added to the cluster. We are using FileSystemConnector along with the Tika
> transformer and indexing data to Solr. The load is quite high. We have
> hundreds of thousands of documents. My questions are:
> 1. We run start-agent in the multiprocess-zk-example. What exactly
> does "start-agent" do?
> 2. We also run the tomcat process.
>
> So which of the above actually runs the jobs, and which one
> should get the higher memory (heap) configuration? Tomcat or start-agent?
>
> Thanks and regards,
> Vinay B S
>
