stanbol-dev mailing list archives

From Alex Lopez <>
Subject Re: Categorization
Date Wed, 16 Mar 2011 12:53:31 GMT
Thank you Olivier,
we'll go on to test OpenCalais / Zemanta / SalsaDev separately for the 
time being.

On 16-03-2011 11:20, Olivier Grisel wrote:
> 2011/3/14 Alex Lopez<>:
>> Hi Stanbol devs,
>> we are working on a semantic app that should do content categorization,
>> among other things.
>> I have some code that takes detected entities as input (in the form of
>> dbpedia URLs) and looks up in dbpedia both YAGO categories and Wikipedia
>> categories (the ones that used to be skos:subject and are now dct:subject).
>> Of course this is tied to particular namespaces and services. I would like
>> to expand it and abstract the particulars while retaining the functionality,
>> something like a wrapper:
>> input: resource/resources/text (raw content)
>> output: categories/topics
>> Keep in mind I'm not looking for categories of the sort
>> Person/Organisation, but rather of the sort Science/Jazz Musicians, etc.
>> I've been following Stanbol's dev list for some time, and I'm excited
>> about the possibilities; maybe I can use some of it for this particular
>> requirement. Right now, I see it as a collection of services; for example, I
>> can see Zemanta, included as an engine, doing what I want with DMOZ topics.
>> (Are there other engines doing something similar?)
>> But I wonder, is there a "central" place with methods/services doing this
>> for all implementations? Or perhaps I misunderstood what the project is
>> about...
>> kind of a getTopicsForResource(){
>>     getDMOZ();
>>     getWikipediaCategories();
>>     getFreebaseCategories();
>>     ...
>> }
>> If not, what is a good place to look for this kind of functionality so I can
>> include it in my method?
>> I understand this is an incubating project, so perhaps this is a planned
>> feature. Do you have a roadmap? Any expected date for a "first release" of
>> Stanbol?
> Yes, this is a planned feature. The existing
> RelatedTopicEnhancementEngine is to be reimplemented to use the entity
> hub index and to build predefined topic indexes out of the dbpedia
> skos hierarchy and the fulltext of the related articles (to be able to
> perform similarity queries using the MoreLikeThis feature of Solr).
> We also need to extend the Stanbol vocabulary to handle topics that
> are not entities.
> In the meantime you can use OpenCalais / Zemanta / SalsaDev directly.
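
For illustration only, the wrapper idea from the quoted message (input: a resource; output: categories/topics, regardless of which service supplies them) could be sketched roughly as below. This is not Stanbol code: `TopicSource` and `TopicAggregator` are hypothetical names, and each source would have to be backed by whatever lookup is actually available (DMOZ via Zemanta, dct:subject categories from dbpedia, and so on).

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical abstraction: one implementation per backing service
// (e.g. a DMOZ lookup, a Wikipedia category lookup). Not a Stanbol API.
interface TopicSource {
    List<String> topicsFor(String resourceUri);
}

// Hypothetical wrapper that hides which services are consulted.
final class TopicAggregator {
    private final List<TopicSource> sources;

    TopicAggregator(List<TopicSource> sources) {
        // Defensive copy so later changes to the caller's list have no effect.
        this.sources = new ArrayList<>(sources);
    }

    // Queries every source in registration order and merges the results,
    // dropping duplicates while keeping first-seen order.
    List<String> getTopicsForResource(String resourceUri) {
        Set<String> merged = new LinkedHashSet<>();
        for (TopicSource source : sources) {
            merged.addAll(source.topicsFor(resourceUri));
        }
        return new ArrayList<>(merged);
    }
}
```

With this shape, the calling code depends only on `getTopicsForResource`, so a source can be added or removed without touching the callers. Since `TopicSource` has a single method, a stub source for testing can be written as a lambda.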
