incubator-general mailing list archives

From Thilo Goetz <>
Subject Re: Proposal for a new incubation project: Unstructured Information Management Architecture - UIMA
Date Sat, 26 Aug 2006 10:44:59 GMT
Hi David,

We have some sample components today.  For example, we have wrappers 
around some of the OpenNLP tools to make them available as UIMA 
components.

Also, as I mentioned in my answer to Ian, we would like to create 
something like the Lucene sandbox for the development of UIMA 
components.  Almost all text processing needs some basic functionality, 
such as segmentation and sentence detection, so it would be a good idea 
to have these available from and developed on Apache.

We have a whole boatload of sample applications that let you feed 
documents to your UIMA instance and then visualize the results in some 
way or other.  Those are more for demonstration and debugging purposes.

From an application perspective, we have great hopes for cooperation 
with the Lucene project.  Even today, so-called semantic search is a 
main application area of UIMA.  The basic idea of semantic search is 
that you can search for information that is not explicitly contained in 
the text, and UIMA is a good basis to create that extra information - 
but that's only half the story.  You then also need a search engine that 
can index that extra information and make it available for search.  An 
application package where you can simply plug in your UIMA entity 
detection (for example) and you have a full semantic search application 
would be very attractive, I believe.

That's more of a mid-term plan, though, as it would also require some 
changes to Lucene.
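
As a rough illustration of the semantic search idea above, the following 
is a minimal Python sketch. It uses neither the UIMA nor the Lucene API. 
The gazetteer, the annotate function, and the SemanticIndex class are all 
hypothetical stand-ins: a toy entity detector produces the "extra 
information", and a toy inverted index stores it next to the ordinary 
words so that it becomes searchable.

```python
import re
from collections import defaultdict

# Hypothetical gazetteer standing in for a real entity-detection component.
GAZETTEER = {"goetz": "Person", "welton": "Person", "apache": "Organization"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def annotate(text):
    """Return the set of entity types found in the text (the extra information)."""
    return {GAZETTEER[tok] for tok in tokenize(text) if tok in GAZETTEER}


class SemanticIndex:
    """Toy inverted index that stores words and entity types side by side."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        # Index the words that literally occur in the text.
        for tok in tokenize(text):
            self.postings[tok].add(doc_id)
        # Index the annotations under a "type:" prefix so they cannot
        # collide with ordinary tokens.
        for etype in annotate(text):
            self.postings["type:" + etype].add(doc_id)

    def search(self, term):
        """Return the sorted ids of documents indexed under the given term."""
        return sorted(self.postings.get(term, set()))


index = SemanticIndex()
index.add(1, "Goetz proposed the project.")
index.add(2, "The sandbox would live at Apache.")
index.add(3, "Segmentation is basic functionality.")
```

With this sketch, index.search("type:Person") finds document 1 even 
though the word "person" never appears in it, while index.search("person") 
finds nothing. That is the "half the story" split in miniature: one 
component creates the annotations, and the index has to be able to store 
and query them.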

I've rambled a bit, but I hope somewhere in what I said is an answer to 
your question (the short answer being "yes" ;-).


David Welton wrote:
>> > What does it *do*?
>> I believe it is basically a big, pluggable, harness
> Harness - will it be able to do something out of the box as a
> demonstration of its capabilities?
