On Fri, Dec 5, 2008 at 8:20 PM, Greg Holmberg <holmberg2066@comcast.net> wrote:
> I seem to remember that IBM's CAS Consumer for indexing into their semantic search engine
> had to solve the same problem. I think it was configurable in a file, if I remember correctly.
>
> Perhaps one of the IBM folks could describe what was done there?
>
Yes, that's right. There's a separate file that contains the
configuration rules for the indexer. This is described in the UIMA
documentation:
http://incubator.apache.org/uima/downloads/releaseDocs/2.2.2-incubating/docs/html/tutorials_and_users_guides/tutorials_and_users_guides.html#ugr.tug.application.integrating_text_analysis_and_search
However, the search engine used for this (available on IBM
alphaWorks) can index annotations over spans of text, which
AFAIK Lucene cannot.
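
To make "index annotations over spans of text" concrete, here is a toy sketch in Python. It is not the alphaWorks engine's API or the UIMA CAS Consumer; the class SpanIndex and its methods annotate and find_within are hypothetical names made up for illustration. The idea is only that typed annotations are stored with character offsets next to the tokens, so a query can ask for a term inside a given annotation type.

```python
from collections import defaultdict


class SpanIndex:
    """Toy in-memory index (hypothetical): tokens plus typed annotation spans."""

    def __init__(self, text):
        self.text = text
        self.tokens = []                # (begin, end, lowercased token)
        self.spans = defaultdict(list)  # annotation type -> [(begin, end)]
        pos = 0
        # Whitespace tokenization, recording character offsets for each token.
        for word in text.split():
            begin = text.index(word, pos)
            end = begin + len(word)
            self.tokens.append((begin, end, word.lower()))
            pos = end

    def annotate(self, ann_type, begin, end):
        """Record an annotation of the given type over text[begin:end]."""
        self.spans[ann_type].append((begin, end))

    def find_within(self, ann_type, term):
        """Return the covered text of each ann_type span containing term as a token."""
        term = term.lower()
        hits = []
        for begin, end in self.spans.get(ann_type, []):
            if any(b >= begin and e <= end and tok == term
                   for b, e, tok in self.tokens):
                hits.append(self.text[begin:end])
        return hits


# Example: two annotations over one sentence (offsets chosen by hand).
idx = SpanIndex("Adam works at IBM Research in New York")
idx.annotate("Organization", 14, 26)  # "IBM Research"
idx.annotate("Location", 30, 38)      # "New York"
org_hits = idx.find_within("Organization", "research")
loc_hits = idx.find_within("Location", "research")
```

A plain term index would match "research" anywhere in the document; the span-aware lookup only matches it inside an Organization annotation.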
-Adam