manifoldcf-user mailing list archives

From Dileepa Jayakody <djayak...@zaizi.com>
Subject Re: Multi-process ManifoldCF with Zookeeper
Date Wed, 04 Nov 2015 06:45:05 GMT
Thank you Karl for the info.

The task I'm setting up is indexing about 15 million small documents in
MCF. Since the load is distributed largely via the database, I think the
database is what needs tuning.
I will follow the performance tuning tips in [1] to tune PostgreSQL
for ManifoldCF. If you have any more suggestions on database
tuning or PostgreSQL sharding, please share them.

Thanks and Regards,
Dileepa

[1]
https://manifoldcf.apache.org/release/release-2.1/en_US/performance-tuning.html

On Wed, Nov 4, 2015 at 11:54 AM, Karl Wright <daddywri@gmail.com> wrote:

> Hi Dileepa,
>
> When you have multiple agents, each agent takes on part of the load.  The
> load is distributed largely via the database; each agent has some body of
> threads which take tasks off the database queue and work on them.  When the
> tasks are done, the agent takes more tasks.  Since tasks are not assigned
> to specific agents in advance, no special allowance need be made to
> distribute workload equally among agents; this happens automatically in
> that slower agents simply wind up doing less.
>
> Karl
>
>
> On Tue, Nov 3, 2015 at 11:57 PM, Dileepa Jayakody <djayakody@zaizi.com>
> wrote:
>
>> Hi All,
>>
>> I am trying out the multi-process example with zookeeper in mcf 2.2 [1]
>> and need some clarifications on how multiple mcf agents (zk client
>> processes) work together when performing a manifold job. I'm a newbie to
>> zookeeper hence my question might be vague, but would like to see your
>> comments on it.
>>
>>
>> When I invoke a manifold job from the UI, is it executed by a single mcf
>> process (agent), or is it shared by multiple agents to distribute the load?
>> If multiple agents share the job, how is the load distributed?
>>
>> Any resources or pointers on how to configure a multi-process manifoldcf to
>> distribute jobs would be much appreciated.
>>
>>
>> Thanks,
>> Dileepa
>>
>> [1]
>> https://manifoldcf.apache.org/release/trunk/en_US/how-to-build-and-deploy.html
>>
>> ------------------------------
>> This message should be regarded as confidential. If you have received
>> this email in error please notify the sender and destroy it immediately.
>> Statements of intent shall only become binding when confirmed in hard copy
>> by an authorised signatory.
>>
>> Zaizi Ltd is registered in England and Wales with the registration number
>> 6440931. The Registered Office is Brook House, 229 Shepherds Bush Road,
>> London W6 7AN.
>>
>
>
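
The pull-based distribution Karl describes above can be illustrated with a toy
simulation. This is only a sketch, not ManifoldCF code: the agent names, the
per-task timings, and the function simulate_pull_queue are hypothetical, and a
plain in-memory counter stands in for the database queue. Whichever agent
becomes free first takes the next task, so nothing is assigned in advance.

```python
import heapq


def simulate_pull_queue(num_tasks, seconds_per_task):
    """Simulate agents that each pull the next task from one shared queue.

    seconds_per_task maps a hypothetical agent name to the time that agent
    needs for one task. Returns the number of tasks each agent completed.
    """
    # Each heap entry is (time at which the agent becomes free, agent name).
    free_at = [(0.0, name) for name in sorted(seconds_per_task)]
    heapq.heapify(free_at)
    completed = {name: 0 for name in seconds_per_task}

    for _ in range(num_tasks):
        # The agent that is free earliest takes the next task off the queue.
        now, name = heapq.heappop(free_at)
        completed[name] += 1
        heapq.heappush(free_at, (now + seconds_per_task[name], name))

    return completed


# Hypothetical setup: one agent is twice as slow as the other.
counts = simulate_pull_queue(
    num_tasks=90,
    seconds_per_task={"fast-agent": 1.0, "slow-agent": 2.0},
)
print(counts)
```

In this sketch the slower agent ends up with fewer tasks without any explicit
balancing step, which matches the behaviour described in the reply above.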

