incubator-jena-dev mailing list archives

From: Stephen Allen <>
Subject: Re: Model/Graph for a remote SPARQL endpoint
Date: Sun, 12 Feb 2012 20:21:04 GMT
On Sat, Feb 11, 2012 at 12:42 PM, Andy Seaborne <> wrote:
> On 11/02/12 02:04, Stephen Allen wrote:
>> Is there a preferred way to represent a remote SPARQL endpoint in a manner
>> that is interchangeable with a local model/graph?  I have code that
>> generates RDF using Jena objects, and I want to store them in either a
>> remote or local graph store (configurable).
> SPARQL? I.e. don't work in terms of API calls but in terms of SPARQL.  A
> simple DatasetGraph can wrap local/remote issues.

Yeah, I guess I was mostly getting hung up on transactions spanning
multiple queries / inserts.
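
For queries at least, that works today.  Something like this rough
sketch (using the com.hp.hpl.jena packages; the service URL is just a
config placeholder) makes the local/remote decision once, behind a
single code path:

    import com.hp.hpl.jena.query.Dataset;
    import com.hp.hpl.jena.query.Query;
    import com.hp.hpl.jena.query.QueryExecution;
    import com.hp.hpl.jena.query.QueryExecutionFactory;

    // Pick the execution path once, based on configuration.
    // A null service URL means "use the local dataset".
    public static QueryExecution makeExecution(String serviceUrl,
                                               Dataset local, Query query) {
        if (serviceUrl != null)
            return QueryExecutionFactory.sparqlService(serviceUrl, query);
        return QueryExecutionFactory.create(query, local);
    }

Either way, the caller consumes the ResultSet and then closes the
QueryExecution.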

>> I see there is a GraphSPARQLService, but it doesn't appear finished (also,
>> I think I would want a DatasetGraph level interface).
> It's read-only and works by mapping each Graph.find call to a remote query.
>> Also I note the DatasetGraphAccessorHTTP in Fuseki, which I believe is an
>> implementation of "SPARQL 1.1 Graph Store HTTP Protocol" [1].  This looks
>> close to what I want, but forces you to add a dependency on Fuseki, and it
>> does not have streaming support for inserts or a natural API that accepts
>> Jena objects.
> DatasetGraphAccessorHTTP is supposed to migrate sometime.
> What do you mean by "natural API that accepts Jena objects"?  The graph store
> protocol is put/get of whole graphs only.
> If the graph is small, then get - do some operations - put might be
> workable.  This has transaction semantics (use ETags).  Just depends on the
> size of the graph.
> And if not, maybe a design where changes are accumulated locally, then
> enacted at the end all at once on the potentially remote graph (with local
> mirror?).

Creating Jena objects and then having to serialize them to Turtle in
order to insert them into a SPARQL Update request seemed like a step
that could be made a little easier.  I thought at first a DatasetGraph
wrapper around an endpoint might work, but I don't think it's quite the
correct interface.  It isn't well suited to a streaming add operation,
as each add() seems to imply a commit.  Also, for querying, SPARQL
seems like a better interface than find().
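
Concretely, the accumulate-locally-then-enact idea Andy describes above
might look like this rough sketch.  I've left out actually executing
the UpdateRequest, since that's the part that differs between local and
remote:

    import java.io.StringWriter;
    import com.hp.hpl.jena.rdf.model.Model;
    import com.hp.hpl.jena.rdf.model.ModelFactory;
    import com.hp.hpl.jena.rdf.model.Statement;
    import com.hp.hpl.jena.update.UpdateFactory;
    import com.hp.hpl.jena.update.UpdateRequest;

    // Buffer adds in a local model, then turn the whole batch into one
    // INSERT DATA.  N-TRIPLE output is prefix-free, so the serialization
    // can be dropped straight into the update body (blank nodes come out
    // as _:label, which INSERT DATA accepts).
    public class BufferedAdder {
        private final Model buffer = ModelFactory.createDefaultModel();

        public void add(Statement stmt) {
            buffer.add(stmt);
        }

        public UpdateRequest flush() {
            StringWriter out = new StringWriter();
            buffer.write(out, "N-TRIPLE");
            buffer.removeAll();
            return UpdateFactory.create("INSERT DATA { " + out + " }");
        }
    }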

I'll think about what I'm trying to do a little more, although Sesame's
RepositoryConnection looks a lot like what I was imagining.  The biggest
issue is managing the transactions; this is where I thought we might
need to extend the SPARQL protocol.  But even without transaction
support, the interface seems useful.
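
For reference, the sort of interface I'm imagining, purely as a
strawman (every name below is invented, and begin/commit/abort would be
no-ops against a stock endpoint until the protocol gains transaction
support):

    import java.util.Iterator;
    import com.hp.hpl.jena.query.ResultSet;
    import com.hp.hpl.jena.sparql.core.Quad;

    // Strawman connection interface; all names here are invented.
    public interface SparqlConnection {
        void begin(boolean write);         // no-op without protocol support
        void commit();
        void abort();

        void add(Iterator<Quad> quads);    // streaming insert
        void delete(Iterator<Quad> quads);

        ResultSet query(String sparql);    // SELECT
        void update(String sparql);        // SPARQL Update

        void close();
    }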

>> Basically I think I'm looking for a Connection object similar to JDBC or
>> Sesame's RepositoryConnection [2].  You could connect to either a local
>> DatasetGraph or a remote endpoint.  For the remote endpoint case, I don't
>> think it's possible to accomplish this fully with standard SPARQL, because
>> of two issues: 1) no transaction support across multiple queries/updates,
>> and 2) local blank node identifiers.
>> Does anyone have any ideas?  Should I start designing such a thing?  The
>> blank node problem could be solved with skolemization [3], and we could
>> initially ignore the transaction issue (thus support Auto Commit mode
>> only).
> Bnodes can be handled by enabling bNode label output, and using <_:...> on
> input.
> But do think about whether bNodes are the right thing to use in the first
> place.
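
A skolemization pass could be as simple as rewriting every blank node
into a well-known IRI, something like this sketch (the urn:skolem:
scheme is made up for illustration):

    import com.hp.hpl.jena.rdf.model.Model;
    import com.hp.hpl.jena.rdf.model.ModelFactory;
    import com.hp.hpl.jena.rdf.model.RDFNode;
    import com.hp.hpl.jena.rdf.model.Resource;
    import com.hp.hpl.jena.rdf.model.Statement;
    import com.hp.hpl.jena.rdf.model.StmtIterator;

    // Rewrite every blank node to an IRI derived from its label.
    public static Model skolemize(Model in) {
        Model out = ModelFactory.createDefaultModel();
        StmtIterator it = in.listStatements();
        while (it.hasNext()) {
            Statement s = it.nextStatement();
            Resource subj = s.getSubject().isAnon()
                ? out.createResource("urn:skolem:"
                      + s.getSubject().getId().getLabelString())
                : s.getSubject();
            RDFNode obj = s.getObject().isAnon()
                ? out.createResource("urn:skolem:"
                      + ((Resource) s.getObject()).getId().getLabelString())
                : s.getObject();
            out.add(subj, s.getPredicate(), obj);
        }
        return out;
    }
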
>> To add transaction support, we would have to add an extension to the
>> SPARQL 1.1 Protocol [4].  An extra parameter to indicate the type of
>> transaction (READ or WRITE, transaction level) and a transaction ID seems
>> like it might be a good approach.  The transaction ID could be a
>> client-generated UUID, which would save a round-trip.  Or maybe a cookie
>> would be a better approach?
> Yes.
> Alternative: ETags give transactions for consistency.  No abort though, and a
> client crash has no rollback, but it's the web way.

I hadn't heard of ETags; I'll take a look!
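
If I follow, the pattern is: GET the graph and remember its ETag
response header, modify it locally, then PUT it back with If-Match so
the write only succeeds if nobody changed the graph in between.
Roughly, for the PUT side (the URL and payload are placeholders):

    import java.io.IOException;
    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    // Conditional replace of a graph: succeeds only if the graph still
    // matches the ETag we saw on the earlier GET.
    public static boolean putIfUnchanged(String graphUrl, String etag,
                                         byte[] turtle) throws IOException {
        HttpURLConnection put =
            (HttpURLConnection) new URL(graphUrl).openConnection();
        put.setRequestMethod("PUT");
        put.setDoOutput(true);
        put.setRequestProperty("Content-Type", "text/turtle");
        put.setRequestProperty("If-Match", etag);
        OutputStream out = put.getOutputStream();
        out.write(turtle);
        out.close();
        // 412 Precondition Failed: someone changed the graph between our
        // GET and this PUT; the caller can re-GET and retry.
        return put.getResponseCode() != 412;
    }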

