incubator-stanbol-dev mailing list archives

From Rupert Westenthaler <rupert.westentha...@gmail.com>
Subject Re: dbpedia solr index dump
Date Fri, 10 Aug 2012 05:58:42 GMT
Hi


On Fri, Aug 10, 2012 at 1:28 AM, harish suvarna <hsuvarna@gmail.com> wrote:
> Thanks Rupert for the update.
> Meanwhile I am looking at generating custom vocab index page
> http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html and
> trying to know which files I have to use under dbpedia chinese download
> available at http://downloads.dbpedia.org/3.8/zh/

Is this the data for the Entities with the URIs
"http://zh.dbpedia.org/resource/{name}"?

Anyway, cool that dbpedia 3.8 finally got released!

>
> The dbpedia download for chinese has article categories, labels, short/long
> abstracts, inter language links. Do not know which ones to use for the
> stanbol entityhub custom vocabulary index tool.

For linking concepts you need only the labels. If you also include the
short abstracts you will also have the mouse over text in the Stanbol
Enhancer UI. Geo coordinates are needed for the map in the enhancer
UI.

You should also include the data providing the rdf:types of the
Entities. However, I do not know which of the files includes those.

Categories are currently not used by Stanbol. If you want to include
them you should add (1) the categories, (2) the category labels and (3)
the article categories.
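
As a rough sketch of the selection described above (Python, illustrative
only): the feature names and the function are made up for this example,
and the dataset names follow the wording of this mail, not the actual
file names in the dbpedia download folder.

```python
# Hypothetical mapping from what you want in Stanbol to the dbpedia
# datasets mentioned in this mail. These are descriptive names, NOT the
# real dump file names; check the download folder for those.
REQUIRED_FOR = {
    "linking": ["labels"],
    "mouse_over_text": ["short abstracts"],
    "map": ["geo coordinates"],
    "types": ["rdf:type data"],
    "categories": ["categories", "category labels", "article categories"],
}


def datasets_for(features):
    """Return the datasets needed for the given features, without duplicates."""
    selected = ["labels"]  # labels are always needed for linking concepts
    for feature in features:
        for dataset in REQUIRED_FOR[feature]:
            if dataset not in selected:
                selected.append(dataset)
    return selected


if __name__ == "__main__":
    # Linking plus mouse-over text and the map in the Enhancer UI.
    print(datasets_for(["mouse_over_text", "map"]))
```

Categories are kept as a separate, optional entry because Stanbol does
not use them at the moment.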

Note that there is a dedicated Entityhub Indexing Tool for dbpedia at
{stanbol-trunk}/entityhub/indexing/dbpedia.


best
Rupert

>
> -harish
>
>
> On Thu, Aug 9, 2012 at 11:08 AM, Rupert Westenthaler <
> rupert.westenthaler@gmail.com> wrote:
>
>> Hi
>>
>> the dbpedia 3.7 index was built by ogrisel so I do not know the details.
>>
>> I think Chinese (zh) labels are included, but the index only contains
>> Entities for Wikipedia pages with 5 or more incoming links.
>>
>> In addition, while the English DBpedia contains zh labels, it will not
>> contain Entities that do not have a counterpart in the English
>> Wikipedia.
>>
>> best
>> Rupert
>>
>> On Thu, Aug 9, 2012 at 1:00 AM, harish suvarna <hsuvarna@gmail.com> wrote:
>> > I received a USB in IKS conf which contained the 1.19GB of dbpedia full
>> > solr index. Does it contain the data from the chinese dump (available in
>> > the dbpedia.org download server under zh folder)?
>> >
>> > I do get some dbpedia entries for chinese text in stanbol enhancements. I
>> > am using the 1.19GB dump. I am expecting some more enhancements which are
>> > present in wikipedia chinese. Just wondering if chinese dump is not
>> > utilized.
>> >
>> > -harish
>>
>>
>>
>> --
>> | Rupert Westenthaler             rupert.westenthaler@gmail.com
>> | Bodenlehenstraße 11                             ++43-699-11108907
>> | A-5500 Bischofshofen
>>



-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
