jena-users mailing list archives

From Andy Seaborne <andy.seabo...@epimorphics.com>
Subject Re: Mulgara: dessert topping or floor wax?
Date Fri, 10 Dec 2010 09:15:03 GMT


On 10/12/10 00:23, Benson Margulies wrote:
> The big driver was scale. The materials my folks looked at left them
> with the impression that Jena+SQL wouldn't handle big data sets with
> adequate performance, and Mulgara would.

How big is a big data set?

We have several storage systems: the original RDB, which certainly does 
not scale and is not recommended for new applications, and SDB, which 
scales better (call it mid-size) but not as well as TDB (non-SQL-backed).

TDB is not transactional though (it could be made transactional ... hint 
to world ...

Either:
1/ Use BDB : https://github.com/afs/TDB-BDB
2/ Write a transactional block manager for TDB.

2 is a much bigger task.  More fun though.  Could do block compression 
at the same time.

...)

> Rereading the trail they
> followed, I'm a bit uncertain about the conclusion. They also got the
> idea that we could talk to Mulgara via Jena, and the more I read, the
> more confused I get about _that_.

There is a proof-of-concept:

http://seaborne.blogspot.com/2008/01/jena-mulgara-example-of-implementing.html

This is pre-TDB - at the time, it was a possible route to scalable 
storage for Jena but it didn't work out.

You can talk to Mulgara with Sesame.

(and Jena to Sesame:
https://github.com/afs/JenaSesame
)

My GitHub projects aren't on the JenaProposal, to keep the focus on the 
core system, but there is no reason we can't also add them to the Apache 
project.

> For this new Fuseki thing, do I talk to it over the Jena API, and if so how?

The real way to talk to any system is either a local API or the SPARQL 
protocol over the wire. The first SPARQL working group was the "RDF Data 
Access Working Group" (DAWG) -- emphasis on the data access: protocol 
and query language.  So if it's a server, then SPARQL.

SPARQL is not the solution for all data access on the semantic web (see 
also the Linked Data API: http://code.google.com/p/linked-data-api/); it 
is a systems building block.

TDB is an embedded database (same process).

You can talk to it via the Jena API (in ARQ for query and update).

Fuseki is a server that provides access using SPARQL protocols (all 3 of 
them: query, update, REST).
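
As a rough illustration of "SPARQL protocol over the wire", here is a 
minimal sketch of how a client might build the two basic requests. It 
assumes the conventions of the SPARQL 1.1 Protocol: a query goes in the 
"query" parameter of an HTTP GET, and an update goes in the "update" 
parameter of a form-encoded HTTP POST. The class name, method names and 
the service URL are hypothetical, and no Jena or Fuseki API is used; 
check the actual service address against your own server setup.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds SPARQL protocol requests with the JDK only.
final class SparqlProtocolSketch {

    // URL for a SPARQL query sent as an HTTP GET; the query text is
    // carried, URL-encoded, in the "query" parameter.
    static String queryUrl(String serviceUrl, String sparqlQuery) {
        return serviceUrl + "?query=" + encode(sparqlQuery);
    }

    // Body for a SPARQL update sent as an HTTP POST with content type
    // application/x-www-form-urlencoded; the update text is carried in
    // the "update" parameter.
    static String updateBody(String sparqlUpdate) {
        return "update=" + encode(sparqlUpdate);
    }

    private static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed, illustrative endpoint address; not taken from the thread.
        String service = "http://localhost:3030/dataset/query";
        System.out.println(queryUrl(service, "SELECT * { ?s ?p ?o } LIMIT 10"));
        System.out.println(updateBody("CLEAR DEFAULT"));
    }
}
```

The resulting URL can be fetched with any HTTP client, which is the 
point of using the protocol: the client does not need to know what 
storage sits behind the server.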

	Andy
