stanbol-dev mailing list archives

From Rupert Westenthaler <>
Subject Re: Contenthub structure
Date Thu, 02 Jun 2011 08:54:15 GMT
Hi all

I will try to sketch a small usage scenario here:

A user posts a query for "CMS workshops in France" to the Contenthub:

The semantic search component of the Contenthub uses several
SearchEngines (like EnhancementEngines in the Enhancer).

1. OntologySearcher: It tries to identify Concepts mentioned in the
query. For the example it will find the Concept "Workshop"
2. EntitySearcher: It tries to find Entities for words used in the
query. For the example it will find "France"
3. Faceted Search engine: It will compose a Lucene-style search for
Documents with
 * a reference to Workshop
 * a reference to France
 * the text "CMS"
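
A minimal sketch of that chain, just to make the three steps concrete.
Everything here is an assumption for illustration: the toy CONCEPTS and
ENTITIES tables stand in for the ontology and the Entityhub, and the
function names and the "reference" / "text" field names are hypothetical,
not an existing Contenthub API.

```python
from dataclasses import dataclass, field

# Toy lookups standing in for an ontology and the Entityhub (assumptions).
CONCEPTS = {"workshop": "Workshop", "workshops": "Workshop"}
ENTITIES = {"france": "France"}
STOPWORDS = {"in", "the", "of", "a"}


@dataclass
class QueryContext:
    text: str
    concepts: list = field(default_factory=list)
    entities: list = field(default_factory=list)


def ontology_searcher(ctx):
    """Step 1: record Concepts mentioned in the query."""
    for word in ctx.text.lower().split():
        concept = CONCEPTS.get(word)
        if concept and concept not in ctx.concepts:
            ctx.concepts.append(concept)


def entity_searcher(ctx):
    """Step 2: record Entities for words used in the query."""
    for word in ctx.text.lower().split():
        entity = ENTITIES.get(word)
        if entity and entity not in ctx.entities:
            ctx.entities.append(entity)


def faceted_search_engine(ctx):
    """Step 3: compose a Lucene-style query string from the context."""
    clauses = [f'+reference:"{ref}"' for ref in ctx.concepts + ctx.entities]
    for word in ctx.text.split():
        key = word.lower()
        # Words not consumed by steps 1 and 2 remain plain text constraints.
        if key not in CONCEPTS and key not in ENTITIES and key not in STOPWORDS:
            clauses.append(f'+text:"{word}"')
    return " ".join(clauses)


def build_query(text):
    ctx = QueryContext(text)
    for engine in (ontology_searcher, entity_searcher):
        engine(ctx)
    return faceted_search_engine(ctx)


print(build_query("CMS workshops in France"))
```

Each engine only adds to the shared context, so further engines could be
plugged into the chain without touching the existing ones.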

If there were another search engine that could understand the internal
structure of the query, one could even search for things
* with the type Workshop
* located within Paris
* the text "CMS"
and because Workshops are events one could activate Facets for
* Location
* Time
* Participants
* facets explicitly requested with the query (e.g. Tags, Creator ...)
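
A rough sketch of how facets could be activated from the type of the
concept found in the query. The SUPERTYPE and FACETS_BY_TYPE tables are
made up for this example; in practice this information would have to come
from the ontology.

```python
# Hypothetical type hierarchy and per-type facets (assumptions).
SUPERTYPE = {"Workshop": "Event", "Event": None}
FACETS_BY_TYPE = {"Event": ["Location", "Time", "Participants"]}


def facets_for(concept, requested=()):
    """Collect facets of the concept and its supertypes, then add the
    facets explicitly requested with the query."""
    facets = []
    current = concept
    while current is not None:
        for facet in FACETS_BY_TYPE.get(current, []):
            if facet not in facets:
                facets.append(facet)
        current = SUPERTYPE.get(current)
    for facet in requested:
        if facet not in facets:
            facets.append(facet)
    return facets


print(facets_for("Workshop", requested=["Tags", "Creator"]))
```

Because the lookup walks up the hierarchy, a Workshop gets the Event
facets without anything being declared for Workshop itself.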

So the Idea is to use

* Ontologies (CMS-Adapter & Kres)
* Entityhub
* maybe neural networks with learned query patterns??
* other stuff??

for query preprocessing and

* full text indices over Documents
* full text indices over Facts (like the Workshop)
* SPARQL endpoints over Enhancements
* other things??

for the execution of the enhanced query.

Joining results from the different sources (Documents, Facts,
Enhancements) would be challenging. However I think this feature would
not be necessary for a first version.
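
For a first version without joining, the execution step could look
roughly like the sketch below: the same query goes to every backend and
the results stay grouped per source. The backend names and the stub
functions are placeholders I made up, not real indices or endpoints.

```python
def execute(query, backends):
    """Run the query against each backend; no joining across sources."""
    return {name: search(query) for name, search in backends.items()}


# Stub backends standing in for a document index, a fact index and a
# SPARQL endpoint over Enhancements (all hypothetical).
backends = {
    "documents": lambda q: [f"doc matching {q}"],
    "facts": lambda q: [f"fact matching {q}"],
    "enhancements": lambda q: [],
}

results = execute("CMS", backends)
for source, hits in results.items():
    print(source, hits)
```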

I would also like to consider this Screencast in the
context of this Usage Scenario.


On Wed, Jun 1, 2011 at 10:26 AM, Olivier Grisel
<> wrote:
> 2011/6/1 Suat Gonul <>:
>> Hi everybody,
>> After discussing with Rupert yesterday, we have come up with a basic design
>> for the Contenthub component.
>> It will provide two main RESTful interfaces to:
>> 1) Upload (register) content and metadata (Available in current
>> implementation)
>> 2) Search for registered content
>> There would be Indexing Engines for (1) and Search Engines for (2). The
>> Contenthub implementation would then implement Indexing Engines to store the
>> enhancements in a triple store and Search Engines to search enhancements and
>> content items in triple store.
>> There is also an already started implementation for the search part in
>> google code base of IKS project at [1]. It will be integrated to the
>> Contenthub component.
>> What do you think?
> I think the default search implementation for content should be based
> on fulltext indexing using the EntityHub's SolrYard extended with
> faceted search.
> I find the combination of fulltext search and facet-based structured
> refinements much more intuitive than the traditional multi-field
> form-based search interface.
> --
> Olivier
> -

| Rupert Westenthaler   
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
