incubator-cvs mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Incubator Wiki] Trivial Update of "UimaProposal" by ThiloGoetz
Date Fri, 22 Sep 2006 16:25:57 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Incubator Wiki" for change notification.

The following page has been changed by ThiloGoetz:
http://wiki.apache.org/incubator/UimaProposal

------------------------------------------------------------------------------
  
  Unstructured Information Management applications are software systems that analyze large
volumes of unstructured information in order to discover knowledge that is relevant to an
end user.  We propose UIMA, a framework and SDK for developing such applications.  An example
UIM application might ingest plain text and identify entities, such as persons, places, organizations;
or relations, such as works-for or located-at.  UIMA enables such an application to be decomposed
into components, for example ''"language identification"'' -> ''"language specific segmentation"''
-> ''"sentence boundary detection"'' -> ''"entity detection (person/place names etc.)"''.
 Each component must implement interfaces defined by the framework and must provide self-describing
metadata via XML descriptor files.  The framework manages these components and the data flow
between them.  Components are written in Java or C++; the data that flows between components
is designed for efficient mapping between these languages.  UIMA additionally provides capabilities
to wrap components as network services, and can scale to very large volumes by replicating
processing pipelines over a cluster of networked nodes.
  
- This framework has already attracted a following among government, commercial, and academic
institutions who previously developed analysis algorithms, but were unable to easily build
on each other's works, and who want to be able to evolve their applications by independently
upgrading parts, as better technology becomes available.  Applications built with this framework
are being used with plain text, audio streams, and mage/video streams, identifying entities
and relations, converting speech to text, translating into different languages, and determining
properties of images.
+ This framework has already attracted a following among government, commercial, and academic
institutions who previously developed analysis algorithms, but were unable to easily build
on each other's works, and who want to be able to evolve their applications by independently
upgrading parts, as better technology becomes available.  Applications built with this framework
are being used with plain text, audio streams, and image/video streams, identifying entities
and relations, converting speech to text, translating into different languages, and determining
properties of images.
  
  The UIMA framework runs components in a flow, passing a common data object containing unstructured
information (free text, audio, video, etc.) through the components.  Each component examines
the unstructured information and data added by other components, and adds data of its own.
 The framework mandates a standardized form of the data being passed, and a standardized form
of the interfaces to the components.
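
The flow described above can be illustrated with a minimal sketch. It is not the UIMA API: `SharedDocument`, `Annotation`, `run_pipeline`, the three toy components, and the `KNOWN_NAMES` list are all hypothetical names invented for this example, and the detection rules are deliberately naive. The sketch only shows the shape of the idea, in which each component reads the common data object, may rely on data added by earlier components, and adds data of its own.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Annotation:
    kind: str        # e.g. "language", "sentence", "person"
    begin: int       # start offset into the text
    end: int         # end offset (exclusive)
    value: str = ""  # optional payload, e.g. a language code


@dataclass
class SharedDocument:
    """Hypothetical stand-in for the common data object passed through the flow."""
    text: str
    annotations: list = field(default_factory=list)

    def of_kind(self, kind):
        return [a for a in self.annotations if a.kind == kind]


# Assumption for illustration only: a tiny fixed list of person names.
KNOWN_NAMES = {"Alice", "Bob"}


def identify_language(doc):
    # Toy rule: label the whole text as English.
    doc.annotations.append(Annotation("language", 0, len(doc.text), "en"))


def detect_sentences(doc):
    # Depends on data added by an earlier component.
    if not doc.of_kind("language"):
        raise ValueError("language identification must run first")
    start = 0
    for i, ch in enumerate(doc.text):
        if ch == ".":
            # Skip whitespace left over from the previous sentence.
            while start < i and doc.text[start].isspace():
                start += 1
            doc.annotations.append(Annotation("sentence", start, i + 1))
            start = i + 1


def detect_persons(doc):
    # Works sentence by sentence, using the annotations added upstream.
    for s in doc.of_kind("sentence"):
        for m in re.finditer(r"[A-Za-z]+", doc.text[s.begin:s.end]):
            if m.group() in KNOWN_NAMES:
                doc.annotations.append(
                    Annotation("person", s.begin + m.start(),
                               s.begin + m.end(), m.group()))


def run_pipeline(doc, components):
    # The "framework" part: pass the same object through every component in order.
    for component in components:
        component(doc)
    return doc


if __name__ == "__main__":
    doc = run_pipeline(
        SharedDocument("Alice works for Acme. Bob is located in Paris."),
        [identify_language, detect_sentences, detect_persons],
    )
    for a in doc.annotations:
        print(a.kind, a.begin, a.end, repr(doc.text[a.begin:a.end]))
```

Because every component takes and mutates the same object type, a component can be swapped for a better one without touching the others, which is the independent-upgrade property the proposal emphasizes.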
  

---------------------------------------------------------------------
To unsubscribe, e-mail: cvs-unsubscribe@incubator.apache.org
For additional commands, e-mail: cvs-help@incubator.apache.org

