incubator-jena-dev mailing list archives

From Paolo Castagna
Subject Re: [jena-dev] Building RDF Schema information from TDB Dataset [ ARQ, TDB ]
Date Wed, 05 Oct 2011 11:16:28 GMT
Dave Reynolds wrote:
> On Wed, 2011-10-05 at 11:22 +0100, Paolo Castagna wrote: 
>> Dave Reynolds wrote:
>>> If you just want to list the properties and classes that are used then
>>> you can do things like:
>>>   SELECT DISTINCT ?p WHERE {?s ?p ?o.}
>>>   SELECT DISTINCT ?cls WHERE {?i a ?cls.}
>> Any idea to speed up these two queries (for large TDB datasets) is welcome! :-)
> I nearly put a comment in that response that those can be very expensive
> queries :)


The fact is that people often do not put vocabularies|ontologies in their data
(rightly so). They still want a list of the properties and classes actually
used in a dataset, and they want counts (i.e. how many times each property or
class is actually used in the dataset).
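
To make "counts" concrete, here is a minimal sketch in plain Python of what
such a listing computes. It is not TDB or ARQ code: the triples are an
in-memory list of made-up strings, and usage_counts is a hypothetical helper
name. Like the two queries quoted above, it has to touch every triple once,
which is why this is expensive on a large dataset.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def usage_counts(triples):
    """Scan every triple once and count property and class usage."""
    properties = Counter()
    classes = Counter()
    for _subject, predicate, obj in triples:
        properties[predicate] += 1
        # A class counts as "used" each time it is the object of rdf:type.
        if predicate == RDF_TYPE:
            classes[obj] += 1
    return properties, classes


# Illustrative data only (assumed, not taken from any real dataset).
triples = [
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:bob", RDF_TYPE, "ex:Person"),
    ("ex:alice", "ex:knows", "ex:bob"),
]
properties, classes = usage_counts(triples)
```

The keys of the two counters are the distinct properties and classes, and the
values are the usage counts.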

The list of properties (with counts) can be derived from the stats.opt file
(if present).

Maybe people who want a very fast list of the distinct properties|classes
actually used in a dataset should have a custom index just for that.
However, one would need to intercept all update operations and keep that
index in sync with the TDB indexes. I have no idea what the best way to
do that is.
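
The idea of a side index kept in sync on every update could look roughly like
the sketch below. This is plain Python over an in-memory set, not TDB: the
CountingStore class and its add/delete methods are hypothetical names, and
how to hook the equivalent into TDB's own update path is exactly the open
question. It only shows the bookkeeping the index would need.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class CountingStore:
    """Hypothetical wrapper: all updates pass through add/delete,
    so the usage counters never drift from the stored triples."""

    def __init__(self):
        self._triples = set()
        self.property_counts = Counter()
        self.class_counts = Counter()

    def add(self, s, p, o):
        triple = (s, p, o)
        if triple in self._triples:
            return False  # a graph is a set: re-adding changes nothing
        self._triples.add(triple)
        self._adjust(p, o, +1)
        return True

    def delete(self, s, p, o):
        triple = (s, p, o)
        if triple not in self._triples:
            return False  # nothing stored, so nothing to uncount
        self._triples.remove(triple)
        self._adjust(p, o, -1)
        return True

    def _adjust(self, p, o, delta):
        self._bump(self.property_counts, p, delta)
        if p == RDF_TYPE:
            self._bump(self.class_counts, o, delta)

    @staticmethod
    def _bump(counter, key, delta):
        counter[key] += delta
        if counter[key] == 0:
            # Drop exhausted keys so the key set is the DISTINCT list.
            del counter[key]


store = CountingStore()
store.add("ex:alice", RDF_TYPE, "ex:Person")
store.add("ex:alice", "ex:knows", "ex:bob")
store.delete("ex:alice", "ex:knows", "ex:bob")
```

With this in place, listing the distinct properties or classes is a lookup of
the counter keys instead of a scan. The cost moves to every update, and a
real version would also have to be updated in the same transaction as the
main indexes.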

Are there better ideas?


> Dave
