uima-user mailing list archives

From Deejay <dee...@binarytweed.com>
Subject Clustering, Collapsing
Date Fri, 08 Jun 2012 15:44:47 GMT
Hi all,

I recently discovered Apache UIMA, and it looks like a very large project! I
was hoping that someone more experienced with it than I am could comment on
whether any parts of the project could help with my problem.

I need to process many millions of objects (Protocol Buffers stored in HBase,
as it happens) and cluster them according to their similarity. Once the
clusters are formed, I need to 'collapse' each cluster into a single object by
taking, for each property, the most prevalent value among the cluster's
members. The collapsed object will then be added to a Solr index.
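
To make the 'collapse' step concrete, here is a rough Python sketch of what I
mean. It assumes each object has already been decoded into a plain dict of
hashable property values and that clustering has already happened. The name
`collapse` and the sample records are made up for illustration and are not
part of any UIMA, HBase, or Solr API.

```python
from collections import Counter


def collapse(cluster):
    """Return one record holding the most common value of each property."""
    # Collect every property name seen in the cluster, in first-seen order.
    keys = []
    for record in cluster:
        for key in record:
            if key not in keys:
                keys.append(key)

    collapsed = {}
    for key in keys:
        # Count only the records that actually carry this property.
        counts = Counter(r[key] for r in cluster if key in r)
        # most_common(1) returns [(value, count)]; on a tie, the value
        # that was seen first wins.
        collapsed[key] = counts.most_common(1)[0][0]
    return collapsed


# Hypothetical cluster of three near-duplicate records.
cluster = [
    {"name": "Acme Ltd", "city": "London"},
    {"name": "Acme Ltd", "city": "Londn"},
    {"name": "ACME", "city": "London"},
]
print(collapse(cluster))  # {'name': 'Acme Ltd', 'city': 'London'}
```

Each property is voted on independently, so the collapsed object may combine
values that never appeared together in any single original object.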

Would any part of Apache UIMA be useful for the clustering or collapsing, or 
have I misunderstood the nature of the project?

