2010/5/27 Anuj Saini <anuj.saini@orkash.com>
> You are trying to generate clusters of similar artifacts. Although this can
> be done at processing time, a better approach is to keep the annotated
> results in a database. My suggestion is to use an index for fast retrieval.
>
> UIMA doesn't provide anything special for this, but you can use
> Lucene/Solr to achieve it. Lucene/Solr has a "MoreLikeThis" feature,
> which is very handy for finding related articles.
>
For this purpose, the Lucas sandbox project [1] could also be useful for writing
what you extracted using UIMA (a CAS) to a Lucene index.
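
Once the annotated documents are in a Solr index, asking for related articles
is just a query with the MoreLikeThis component switched on. A rough sketch of
building such a request is below. The base URL, the "id:doc42" query, and the
"text" field are made-up placeholders, and build_mlt_url is only an
illustrative helper; the mlt, mlt.fl, mlt.count, mlt.mintf, and mlt.mindf
names are Solr's MoreLikeThis request parameters. The exact select path
depends on how your Solr instance is set up.

```python
from urllib.parse import urlencode


def build_mlt_url(base_url, doc_query, fields, count=5, min_tf=1, min_df=1):
    """Build a Solr select URL with the MoreLikeThis component enabled.

    base_url, doc_query and fields are placeholders chosen for illustration;
    adapt them to your own Solr instance and schema.
    """
    params = [
        ("q", doc_query),              # query selecting the source document(s)
        ("mlt", "true"),               # turn on MoreLikeThis for the results
        ("mlt.fl", ",".join(fields)),  # fields used to judge similarity
        ("mlt.count", str(count)),     # similar documents returned per result
        ("mlt.mintf", str(min_tf)),    # minimum term frequency in the source doc
        ("mlt.mindf", str(min_df)),    # minimum document frequency in the index
    ]
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical host, document id, and field name.
url = build_mlt_url("http://localhost:8983/solr", "id:doc42", ["text"])
print(url)
```

Fetching that URL should return the matching document together with a list of
similar documents per hit, which can serve as a starting point for grouping
related artifacts.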
Cheers,
Tommaso
[1] : http://uima.apache.org/sandbox.html#lucas.consumer