stanbol-dev mailing list archives

From "Srecko Joksimovic" <sreckojoksimo...@gmail.com>
Subject Indexing and searching using Apache Stanbol
Date Mon, 12 Mar 2012 17:03:11 GMT
Hi,

So far I have developed a few applications for annotating documents using
Apache Stanbol. Now I need to add indexing and search capabilities.

I tried the ContentHub
(http://incubator.apache.org/stanbol/docs/trunk/contenthub/contenthub5min)
by starting the full launcher and using its web interface. It offers a few
ways to submit content: providing text, uploading a document, or providing a
URI. First I uploaded a few txt documents. I didn't get any extracted
entities, but search (using the Web View) worked fine. Then I uploaded pdf
documents and got extracted entities grouped into the People, Places and
Concepts categories. The pdf also appeared in the list of recently uploaded
documents, but I couldn't find any term from that document by searching.

I suppose I will have to provide a stream from the pdf (or any other kind
of) document and index it like text? I need all of the mentioned
functionality (indexing text, documents and URIs) from a Java application,
and I would appreciate a code example, if one is available.

Thank you!

Srecko Joksimovic

