stanbol-dev mailing list archives

From Andy Seaborne <a...@apache.org>
Subject Re: How to add a new TripleCollection to Stanbol
Date Fri, 02 Nov 2012 20:21:19 GMT
The Clerezza parser is SPARQL 1.0 (from Mulgara) so no 1.1 features.

The version of Jena in Clerezza is Jena 2.6.4 (Dec 2010) and TDB 0.8.9 
(Jan 2011).  That version of TDB does not support transactions (and has 
various other bugs).

Andrea - yes, tdbloader is going to be much faster.  Data loaded via the 
HTTP interface is validated before inserting into the dataset, which 
means buffering and no tricks used by the loader.

	Andy
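
Until aggregates are available, a count can be computed on the client by running a plain SELECT and counting the solutions. The following is only a minimal sketch using the Java standard library: the class name ClientSideCount and the sample rows are made up for illustration, and it assumes the query result can be consumed as a java.util.Iterator of solutions.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Clerezza, Jena or Stanbol.
class ClientSideCount {

    /** Counts the remaining elements of an iterator, e.g. the rows of a SELECT result. */
    static long countRows(Iterator<?> rows) {
        long n = 0;
        while (rows.hasNext()) {
            rows.next(); // consume the solution, only the number of rows matters
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Stand-in for the solutions of "SELECT ?s WHERE { ?s ?p ?o }"
        List<String> rows = Arrays.asList("urn:example:a", "urn:example:b", "urn:example:c");
        System.out.println(countRows(rows.iterator())); // prints 3
    }
}
```

This pulls every solution to the client, so for large result sets it is likely better to keep sending such queries to Fuseki, as Andrea already does.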

On 31/10/12 14:25, Rupert Westenthaler wrote:
> Hi
>
> AFAIK the Clerezza SPARQL implementation does not use the Graph
> specific SPARQL implementation. Because of that you are limited to
> what Clerezza supports and cannot access additional features. This
> limitation is also the reason why I am interested in extending the
> Stanbol SPARQL endpoint to directly support Jena Datasets and possibly
> even others (Sesame, Virtuoso ...) registered with the same metadata
> as currently supported for Clerezza TripleCollections.
>
> best
> Rupert
>
> On Wed, Oct 31, 2012 at 2:32 PM, Andrea Di Menna <andreadm@inqmobile.com> wrote:
>> Hi Rupert,
>>
>> thanks for your precious help.
>>
>> I am using the default graph hence I had to build a custom component.
>> After this was done I could access the TDB with Stanbol :-)
>>
>> From what I can see though, the Clerezza SPARQL processor Stanbol is using
>> does not support aggregate functions like count.
>> Can you confirm? Is it possible to switch to ARQ for SPARQL queries?
>>
>> At the moment I am using Fuseki to handle queries as well (b.t.w. I
>> realised it was much much faster to build the TDB using tdbloader2 instead
>> of sending triples to Fuseki - dumb me, should have known before starting).
>>
>> Thanks for your great support!
>>
>> Cheers
>>
>> 2012/10/30 Rupert Westenthaler <rupert.westenthaler@gmail.com>
>>
>>> Hi
>>>
>>> To use an existing Jena TDB store with Apache Stanbol you need:
>>>
>>> 1. to make the Jena TDB store available in Apache Clerezza
>>> 2. configure a Stanbol Entityhub ClerezzaYard for your Graph URI
>>>
>>> ad1: Do you use named graphs or the TDB triple store? In the
>>> SNAPSHOT version of "rdf.jena.tdb.storage"
>>> (org.apache.clerezza:rdf.jena.tdb.storage:0.6-incubating-SNAPSHOT)
>>> there is a SingleTdbDatasetTcProvider. It allows you to configure
>>> (e.g. via the Configuration tab of the Apache Felix WebConsole) the
>>> directory of the local file system where your TDB store is located. If
>>> you configure an instance with the location of your existing TDB
>>> store, then Clerezza should have access to the data. However, this
>>> works only for named graphs (SPOC) and the union graph over all SPOC
>>> graphs. The SPO graph is not exposed by the
>>> SingleTdbDatasetTcProvider.
>>>
>>> ad2: As soon as you have your TDB store available in Clerezza you can
>>> configure ClerezzaYard instance(s) (e.g. via the Configuration tab of
>>> the Apache Felix WebConsole). It is important that the value of the
>>> "Graph URI" property refers to a Context (C) of your named graphs
>>> (SPOC) or to the URI of the union graph (as configured in the
>>> configuration of the SingleTdbDatasetTcProvider).
>>>
>>> The ClerezzaYard will automatically register the Clerezza MGraph with
>>> the Stanbol SPARQL endpoint.
>>>
>>>
>>> As an alternative you could also implement your own component that (1)
>>> opens the Jena TDB store and (2) wraps the Jena graph with a Clerezza
>>> MGraph.
>>>
>>> For that you create your own module and implement a component
>>>
>>>      @Component(
>>>          configurationFactory=true,
>>>          policy=ConfigurationPolicy.REQUIRE, //the TDB path is required!
>>>          specVersion="1.1",
>>>          metatype = true)
>>>      public class TdbGraphRegisteringComponent {
>>>
>>>          @Property
>>>          public static final String TDB_PATH = "jena.tdb.path";
>>>
>>> When your bundle starts, OSGi will call the activate(..) method, and
>>> deactivate(..) when it is stopped.
>>>
>>>      protected void activate(ComponentContext ctx) throws ConfigurationException {
>>>          String tdbPath = (String)ctx.getProperties().get(TDB_PATH);
>>>          if(tdbPath == null){
>>>              throw new ConfigurationException(TDB_PATH,
>>>                  "Jena TDB path MUST BE configured");
>>>          }
>>>
>>> So what you need to do is to initialize the Jena TDB store from the
>>> configured TDB_PATH, create a Clerezza MGraph and register it as an
>>> OSGi service
>>>
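>>>      //Init the Jena TDB model, e.g. with TDBFactory.createModel(tdbPath)
>>>      com.hp.hpl.jena.rdf.model.Model model;
>>>
>>>      //JenaGraphAdaptor wraps a Jena Graph, so pass model.getGraph()
>>>      MGraph graph = new LockableMGraphWrapper(
>>>          new PrivilegedMGraphWrapper(new JenaGraphAdaptor(model.getGraph())));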
>>>
>>> and then register this MGraph with the OSGi ServiceRegistry (whiteboard
>>> pattern)
>>>
>>>      Dictionary<String,Object> graphRegProp = new Hashtable<String,Object>();
>>>      //the URI under which you want to register your graph
>>>      graphRegProp.put("graph.uri", graphUri);
>>>      //optionally the name and description of the graph (used in the UI)
>>>      graphRegProp.put("graph.name", getConfig().getName());
>>>      graphRegProp.put("graph.description", getConfig().getDescription());
>>>      //now register the graph with the OSGi Service Registry
>>>      graphRegistration = ctx.getBundleContext().registerService(
>>>          TripleCollection.class.getName(), graph, graphRegProp);
>>>
>>> You will need to store the graphRegistration in a field and unregister
>>> it when your component is deactivated.
>>>
>>>      protected void deactivate(ComponentContext ctx){
>>>          if(graphRegistration != null){
>>>              graphRegistration.unregister();
>>>          }
>>>      }
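
The configuration check in activate(..) and the registration properties can be tried without an OSGi container. The sketch below is only an illustration under stated assumptions: the class TdbGraphConfig and its method names are hypothetical, IllegalArgumentException stands in for the OSGi ConfigurationException, and only the property keys (jena.tdb.path, graph.uri, graph.name, graph.description) come from the message above.

```java
import java.util.Dictionary;
import java.util.Hashtable;

// Hypothetical helper for illustration, not part of Stanbol or Clerezza.
class TdbGraphConfig {

    static final String TDB_PATH = "jena.tdb.path"; // property key used in the thread

    /** Returns the configured TDB path, or fails like the null check in activate(..). */
    static String requireTdbPath(Dictionary<String, Object> config) {
        Object value = config.get(TDB_PATH);
        if (!(value instanceof String) || ((String) value).trim().isEmpty()) {
            // Stand-in for org.osgi.service.cm.ConfigurationException
            throw new IllegalArgumentException(TDB_PATH + ": Jena TDB path MUST BE configured");
        }
        return (String) value;
    }

    /** Builds the service properties under which the graph would be registered. */
    static Dictionary<String, Object> registrationProperties(
            String graphUri, String name, String description) {
        Dictionary<String, Object> props = new Hashtable<String, Object>();
        props.put("graph.uri", graphUri);
        // Name and description are optional; Hashtable rejects null values.
        if (name != null) {
            props.put("graph.name", name);
        }
        if (description != null) {
            props.put("graph.description", description);
        }
        return props;
    }

    public static void main(String[] args) {
        Dictionary<String, Object> config = new Hashtable<String, Object>();
        config.put(TDB_PATH, "/path/to/tdb"); // placeholder path
        System.out.println(requireTdbPath(config));
        System.out.println(registrationProperties("urn:example:graph", "Example", null).size());
    }
}
```

Skipping null values matters because java.util.Hashtable throws a NullPointerException for null keys or values, which would otherwise surface when the optional name or description is not configured.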
>>>
>>> BTW: The Apache Stanbol SPARQL Endpoint might get extended to
>>> directly support Jena Models registered by the same metadata.
>>>
>>> best
>>> Rupert
>>>
>>> On Mon, Oct 29, 2012 at 6:59 PM, Andrea Di Menna <andreadm@inqmobile.com>
>>> wrote:
>>>> Hi,
>>>>
>>>> my name is Andrea and I am a complete newbie when it comes to Stanbol :)
>>>> I was wondering whether it is possible to add a new TripleCollection to
>>>> Stanbol.
>>>> I would like to query the triple store using the provided SPARQL endpoint.
>>>>
>>>> Reading the code I got that the SPARQL endpoint is already looking for
>>>> services which have a graph.uri property defined.
>>>> Although I cannot understand what kind of component I should add into Felix.
>>>>
>>>> At the moment I am building a Jena TDB using Fuseki (loading n-triples with
>>>> s-put), but I would like to have a single web service handling this.
>>>> Would be great if I could somehow "move" the generated TDB to Stanbol.
>>>>
>>>> Could you please give me any suggestion?
>>>>
>>>> Thanks
>>>>
>>>> --
>>>> Andrea Di Menna
>>>>
>>>>
>>>>
>>>>
>>>> This e-mail is only intended for the person(s) to whom it is addressed
>>> and may contain CONFIDENTIAL information. Any opinions or views are
>>> personal to the writer and do not represent those of INQ Mobile Limited,
>>> Hutchison Whampoa Limited or its group companies.  If you  are not the
>>> intended recipient, you are hereby notified that any use, retention,
>>> disclosure, copying, printing, forwarding or dissemination of this
>>> communication is strictly prohibited. If you have received this
>>>   communication in error, please erase all copies of the message and its
>>>   attachments and notify the sender immediately. INQ Mobile Limited is  a
>>> company registered in the British Virgin Islands. www.inqmobile.com.
>>>>
>>>
>>>
>>>
>>> --
>>> | Rupert Westenthaler             rupert.westenthaler@gmail.com
>>> | Bodenlehenstraße 11                             ++43-699-11108907
>>> | A-5500 Bischofshofen
>>>
>>
>>
>>
>> --
>> Andrea Di Menna
>> INQ - Engineering
>> +393925803119
>> skype: ninniux
>> inqmobile.com
>> INQ¹ – Winner of the 2009 Best Handset
>>
>>
>>
>>
>>
>>
>
>
>

