stanbol-dev mailing list archives

From <arthi.ven...@wipro.com>
Subject RE: Working with large RDF data
Date Tue, 17 Sep 2013 13:30:03 GMT
Thanks Rupert,
I have a few clarifications/queries, which I have asked inline below.

"It should be possible to reason over the enhancement results and store all triples (including
the deduced one) in Jena TDB. After that you can use SPARQL on the Jena TDB as suggested by
Reto. However note that any change in the Ontology will not be reflected in the Jena TDB -
as there is not truth maintenance."

I am currently storing RDF in a separate SDB outside Stanbol. Is there a way this (or TDB)
can be stored and managed as part of Stanbol?
For the problem of stale triples, I could refresh the entire store on a change of the Ontology.
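
For reference, the workflow you describe can be sketched with plain Jena roughly as below.
This is only a sketch of my current understanding: the TDB directory, the query and the
class name are placeholders, the packages assume the Jena 2.x line used by TDB, and the
reasoned closure is assumed to be computed already (see my next question below).

import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.tdb.TDBFactory;

public class TdbStoreAndQuery {

    /** Persists an already reasoned model into TDB and queries it with SPARQL. */
    public static void storeAndQuery(Model reasonedModel) {
        // Open (or create) the TDB store at a directory of your choice (placeholder path).
        Dataset dataset = TDBFactory.createDataset("/data/stanbol-tdb");
        Model tdbModel = dataset.getDefaultModel();

        // "Refresh on Ontology change": drop the old closure and re-add it,
        // since TDB itself performs no truth maintenance.
        tdbModel.removeAll();
        tdbModel.add(reasonedModel);

        // Query the persisted triples with SPARQL, as Reto suggested (placeholder query).
        String query = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10";
        QueryExecution qe = QueryExecutionFactory.create(query, tdbModel);
        try {
            ResultSet results = qe.execSelect();
            while (results.hasNext()) {
                System.out.println(results.next());
            }
        } finally {
            qe.close();
        }
        dataset.close();
    }
}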


"If the data does fit into memory you just store the plain RDF data, load them into an reasoning
session to get the results. After that you can store the results in an other RDF store (e.g.
Jena TDB) for later queries."
How do you load RDF data into a session? I could see a way to load an Ontology into a session,
but not RDF instances.
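
For what it is worth, the plain-Jena equivalent I had in mind (not the Stanbol session API,
which is what I am asking about) passes the RDF instance data to the reasoner as a data model
separate from the Ontology. File names and the class name below are placeholders:

import com.hp.hpl.jena.rdf.model.InfModel;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.reasoner.Reasoner;
import com.hp.hpl.jena.reasoner.ReasonerRegistry;

public class ReasonOverInstances {

    /** Computes asserted + deduced triples, ready to be persisted (see the sketch above). */
    public static Model computeClosure() {
        // Ontology (schema) and RDF instance data live in separate models (placeholder files).
        Model ontology = ModelFactory.createDefaultModel().read("file:ontology.owl");
        Model instances = ModelFactory.createDefaultModel().read("file:enhancements.rdf");

        // Bind the Ontology to a reasoner, then apply it to the instance data.
        Reasoner reasoner = ReasonerRegistry.getOWLReasoner().bindSchema(ontology);
        InfModel inf = ModelFactory.createInfModel(reasoner, instances);

        // Materialize the asserted triples plus the deductions.
        Model closure = ModelFactory.createDefaultModel();
        closure.add(instances);
        closure.add(inf.getDeductionsModel());
        return closure;
    }
}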


"IMO if you need reasoning support over the whole knowledge base you should use a System that
natively supports it. While the above workflows would allow to mimic such functionality it
will become unpractical as the amount of data grows."
I will evaluate some other stores to be used along with Stanbol, such as Virtuoso, to see
if this limitation can be overcome.


Thanking you and Regards,
Arthi

-----Original Message-----
From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com] 
Sent: Tuesday, September 17, 2013 12:46 PM
To: dev@stanbol.apache.org
Subject: Re: Working with large RDF data

Hi

It should be possible to reason over the enhancement results and store all triples (including
the deduced ones) in Jena TDB. After that you can use SPARQL on the Jena TDB as suggested by
Reto. However note that any change in the Ontology will not be reflected in the Jena TDB,
as there is no truth maintenance.

If the data does fit into memory you can just store the plain RDF data and load it into a
reasoning session to get the results. After that you can store the results in another RDF store
(e.g. Jena TDB) for later queries.

IMO if you need reasoning support over the whole knowledge base you should use a system that
natively supports it. While the above workflows would allow you to mimic such functionality,
this will become impractical as the amount of data grows.

best
Rupert




On Mon, Sep 16, 2013 at 3:29 PM, Reto Bachmann-Gmür <reto@wymiwyg.com> wrote:
> Why in memory? A TDB-based Clerezza store is quite efficient, so why not
> add the data to such a graph?
>
> reto
>
>
> On Sat, Sep 14, 2013 at 9:14 AM, <arthi.venkat@wipro.com> wrote:
>
>> Thanks a lot Rupert
>> If the RDF data is smaller (i.e. can fit into memory), is there a way we
>> can import it into Stanbol and do a joint search across the enhancements
>> from unstructured text as well as the imported RDF data?
>> If yes, would this import be permanent, or would it need to be repeated each time?
>>
>>
>> Thanks and Rgds,
>> Arthi
>>
>>
>> -----Original Message-----
>> From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
>> Sent: Saturday, September 14, 2013 12:40 PM
>> To: dev@stanbol.apache.org
>> Subject: Re: Working with large RDF data
>>
>> Hi Arthi
>>
>> AFAIK the reasoning and rule components of Apache Stanbol are
>> intended to be used in "Sessions". They are not intended to be used
>> on a whole knowledge base. A typical use case could be validating RDF
>> data retrieved from a remote server (e.g. Linked Data) against some
>> validation rules, or rewriting RDF generated by the Enhancer
>> (Refactor Engine).
>>
>> Applying Rules and Reasoning on a whole knowledge base (RDF data that
>> does not fit in memory) is not a typical use case.
>>
>> Based on your problem description you might want to have a look at:
>>
>> * Apache Marmotta and the KiWi Triple Store
>> (http://marmotta.incubator.apache.org/kiwi/introduction.html): This
>> is a Sesame Sail implementation that supports reasoning.
>> * OWLIM (http://www.ontotext.com/owlim): A commercial product also
>> implementing reasoning on top of the Sesame API.
>>
>> But I am not an expert in those topics, so there might be additional
>> options I am not aware of.
>>
>> hope this helps
>> best
>> Rupert
>>
>>
>> On Fri, Sep 13, 2013 at 1:48 PM,  <arthi.venkat@wipro.com> wrote:
>> > Hi,
>> >
>> > I have large RDF data.
>> >
>> > The requirement is to be able to reason / run rules on this data /
>> > search this data along with any other unstructured data which I have
>> > enhanced using Stanbol.
>> >
>> > Any pointers on how I can achieve this?
>> >
>> > Thanking you and Rgds,
>> >
>> > Arthi
>>
>>
>>
>> --
>> | Rupert Westenthaler             rupert.westenthaler@gmail.com
>> | Bodenlehenstraße 11                             ++43-699-11108907
>> | A-5500 Bischofshofen
>>



-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
