uima-dev mailing list archives

From Jaroslaw Cwiklik <uim...@gmail.com>
Subject [PROPOSAL] Distributed UIMA Cluster Computing
Date Fri, 05 Oct 2012 16:41:50 GMT

Hi, UIMA Developers,

I've been working on an extension (with some other people) to UIMA that
I'd like to bring into the sandbox, if others agree it would be a good
addition.  We've been calling it DUCC, for Distributed UIMA Cluster
Computing.  Before going into details, here's a high-level description:

DUCC is a cluster management system providing tooling, management, and
scheduling facilities to automate the scale-out of applications written to
the UIMA framework.

Core UIMA provides a generalized framework for applications that process
unstructured information such as human language, but does not provide a
scale-out mechanism.  UIMA-AS provides a scale-out mechanism to
distribute UIMA pipelines over a cluster of computing resources, but
does not provide job or cluster management of the resources.  DUCC
completes the set by providing job support, cluster management, and
automation for the scale-out of UIMA applications over UIMA-AS on large
computing clusters.

We have an initial implementation that has been used by one project;
we'd like to move this into the UIMA project for further development, and to
make it available to others.

Do you think this would be a worthwhile addition, and does it make sense
to bring it (initially) into the Sandbox?
