uima-user mailing list archives

From holmberg2...@comcast.net (Greg Holmberg)
Subject Re: Lucene cas consumer
Date Sat, 06 Dec 2008 01:20:11 GMT

 -------------- Original message ----------------------
From: "Roberto Franchini" <ro.franchini@gmail.com>

> So we need a highly configurable component, able to map only
> certain declared features, apply the right analyzer, and so on.
> Many ways are possible:
> - completely programmatic: the indexer is abstract and should be
> extended to implement the right mapping for a specialized type system
> and pipeline
> - configurable: mapping rules are defined in a descriptor file; the
> JENA component took this approach
> - a mix of the two: some mappings are configured, others are implemented
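
To make the "configurable" option concrete, here is a minimal sketch of what declared mapping rules could look like. Everything in it is an assumption made for illustration: the class name MappingSketch, the rule format "Type.feature = field,analyzer", and the analyzer name "keyword" are hypothetical and do not come from any existing UIMA or Lucene component. Features with no rule are simply not mapped.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical sketch of a declarative feature-to-field mapping. */
class MappingSketch {

    /** One rule: the target index field and the name of the analyzer to apply. */
    static final class FieldRule {
        final String field;
        final String analyzer;

        FieldRule(String field, String analyzer) {
            this.field = field;
            this.analyzer = analyzer;
        }
    }

    /**
     * Parses rules of the form "Type.feature = field,analyzer".
     * The format is invented for this example, not an existing descriptor syntax.
     */
    static Map<String, FieldRule> parse(String config) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(config));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, FieldRule> rules = new TreeMap<>();
        for (String feature : props.stringPropertyNames()) {
            String[] parts = props.getProperty(feature).split(",");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Bad rule for " + feature);
            }
            rules.put(feature, new FieldRule(parts[0].trim(), parts[1].trim()));
        }
        return rules;
    }

    public static void main(String[] args) {
        String config = "Entity.name = entityName,keyword\n"
                      + "Entity.type = entityType,keyword\n";
        Map<String, FieldRule> rules = parse(config);
        for (Map.Entry<String, FieldRule> e : rules.entrySet()) {
            FieldRule rule = e.getValue();
            System.out.println(e.getKey() + " -> " + rule.field + " (" + rule.analyzer + ")");
        }
        // Only declared features are mapped; anything else is left out of the index.
        System.out.println("Entity.confidence mapped? " + rules.containsKey("Entity.confidence"));
    }
}
```

The "mix of the two" option could presumably start from the same parsed rules and let a subclass override the handling of individual features.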

I seem to remember that IBM's CAS Consumer for indexing into their semantic search engine
had to solve the same problem.  I think its mapping was configured in a file.

Perhaps one of the IBM folks could describe what was done there?

A separate question: what kinds of annotations is it possible to index into Lucene?  In other
words, what functionality are we shooting for?

For example, can I index named entities?  In my case, named entities look like the attached
UML class diagram.  I would like to perform queries for documents that contain certain entities
or types of entities.  For example, find documents that contain entity name=IBM, type=Company.
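
To pin down the functionality I am asking about, here is a small in-memory sketch of the two queries. It deliberately does not use the Lucene API; the class EntityIndexSketch, its methods, and the key formats "type:..." and "entity:.../..." are all hypothetical, and the entity model is reduced to just a name and a type. In a real index I would expect these keys to become terms in some field, but that is an assumption.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical in-memory stand-in for an entity-aware document index. */
class EntityIndexSketch {

    // Maps a key such as "type:Company" or "entity:Company/IBM" to matching document ids.
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Records that the document contains an entity with the given type and name. */
    void add(String docId, String type, String name) {
        postings.computeIfAbsent("type:" + type, k -> new TreeSet<>()).add(docId);
        postings.computeIfAbsent("entity:" + type + "/" + name, k -> new TreeSet<>()).add(docId);
    }

    /** Documents containing any entity of the given type. */
    Set<String> withType(String type) {
        return postings.getOrDefault("type:" + type, Collections.emptySet());
    }

    /** Documents containing the entity with the given type and name. */
    Set<String> withEntity(String type, String name) {
        return postings.getOrDefault("entity:" + type + "/" + name, Collections.emptySet());
    }

    public static void main(String[] args) {
        EntityIndexSketch index = new EntityIndexSketch();
        index.add("doc1", "Company", "IBM");
        index.add("doc2", "Person", "Greg Holmberg");
        index.add("doc3", "Company", "IBM");

        // "Find documents that contain entity name=IBM, type=Company."
        System.out.println(index.withEntity("Company", "IBM"));
        // "Find documents that contain certain types of entities."
        System.out.println(index.withType("Person"));
    }
}
```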

Greg Holmberg

  • Unnamed multipart/mixed (inline, None, 0 bytes)