uima-user mailing list archives

From Lou DeGenaro <lou.degen...@gmail.com>
Subject Re: Scale out tuning for jobs
Date Thu, 16 Oct 2014 18:38:54 GMT
Amit,

DUCC should use all available resources as configured by your ducc.classes
and ducc.nodes files.
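
For reference, a minimal sketch of a ducc.nodes file: as far as I recall it is
simply a list of worker host names, one per line (the host names below are
only placeholders):

    node01.example.com
    node02.example.com
    node03.example.com
    node04.example.com

Any node listed there (and covered by a nodepool/class definition in
ducc.classes) should be eligible to receive processes for your job.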

Lou.


On Thu, Oct 16, 2014 at 11:59 AM, Amit Gupta <gupta.v.amit@gmail.com> wrote:

> Thanks for the clarification, Burn.
>
> So indeed there is no way to "force" a job to scale out to the maximum
> resources available?
>
> What I'm finding is that even though a job takes > 1 hour to complete using
> 2 nodes, it doesn't use some extra available nodes that are part of the
> DUCC cluster.
>
> a. Is there no configuration option to deal with this (I'm guessing this
> requirement may have come up before)?
>
> b. Would you happen to know what part of the UIMA code makes that decision
> (i.e., the trigger to spawn a process on a new node or not)?
>
>
> Thanks again for your help,
>
> Best,
> Amit
>
> On Thu, Oct 16, 2014 at 9:32 AM, Burn Lewis <burnlewis@gmail.com> wrote:
>
> > Yes, that parameter only limits the maximum scaleout.  DUCC will ramp up
> > the number of processes based on the available resources and the amount
> > of work to be done.  It initially starts only 1 or 2 processes, and only
> > when one initializes successfully will it start more.  It may not start
> > more if it suspects that all the work will be completed on the existing
> > nodes before any new ones are ready.
> >
> > There is an additional type of scaleout, within each process, controlled
> > by --process_thread_count, which sets how many threads in each process
> > are capable of processing separate work items.
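> >
> > As a rough illustration, the scaleout-related lines of a job
> > specification might look like this (if I remember correctly the property
> > names mirror the long CLI option names; the values are only placeholders):
> >
> >     process_deployments_max = 8
> >     process_thread_count    = 4
> >
> > With up to 8 processes and 4 threads per process, as many as 32 work
> > items could be in flight at once, resources permitting.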
> >
> > ~Burn
> >
> > On Wed, Oct 15, 2014 at 7:11 PM, Amit Gupta <gupta.v.amit@gmail.com>
> > wrote:
> >
> > > Hi,
> > > I've been trying to find the options related to the configuration of
> > > scaleout for a DUCC job.
> > >
> > > Thus far the only one I've found is:
> > >
> > > process_deployments_max:
> > > which limits the maximum number of processes spawned by a DUCC job.
> > >
> > > At what point does DUCC decide to spawn a new process or spread
> > > processing out to a new node? Is there a tuning parameter for an optimal
> > > number of work items per process spawned? Can the user control this
> > > behavior?
> > >
> > > For example,
> > > I have a job large enough that DUCC natively spreads it across 2 nodes.
> > > I haven't been able to force this job, via a config parameter, to spread
> > > across 4 nodes (or "X" nodes) for faster processing times.
> > >
> > > Does anyone know if there's a parameter that can directly control
> > > scaleout in this manner?
> > >
> > > Thanks,
> > >
> > > --
> > > Amit Gupta
> > >
> >
>
>
>
> --
> Amit Gupta
>
