incubator-clerezza-dev mailing list archives

From Bertrand Delacretaz <bdelacre...@apache.org>
Subject Re: SPARQL query performance/scalability
Date Tue, 15 Jun 2010 11:32:53 GMT
Hi Reto,

Thanks very much for your reply.

On Tue, Jun 15, 2010 at 11:07 AM, Reto Bachmann-Gmuer
<reto.bachmann@trialox.org> wrote:
> ...The approach of putting triples in an ArrayList seems the limiting factor
> and I don't understand why you do this. TcManager is as scalable as the
> underlying storage backend (i.e. tdb or sesame)....

Ok - as I said I just copied that code from an example.

How should I apply a sparql query to the complete set of triples of a
given TcManager efficiently?

IIUC it's something like

  MGraph defaultGraph = new UnionMGraph(...);
  ResultSet rs = tcManager.executeSparqlQuery(query, defaultGraph);

but I'm not sure what the defaultGraph exactly means - could I use
null if the sparql query contains a FROM clause, and would that be
better?

>
> The clerezza sparql endpoint doesn't need to have the triples in memory. You
> can have a UnionGraph of multiple graphs even from different backends
> without any data actually being copied around...

Ok, so I guess what I'm missing is how to construct that UnionGraph
efficiently, while allowing all triples to be queried.
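
To make the "no data copied around" idea concrete, here is a minimal sketch of a read-only union view over several collections. It is plain Java, not Clerezza code: the class name UnionView is hypothetical and it only illustrates the general technique of iterating the underlying collections lazily instead of copying their elements into an ArrayList. Unlike a real RDF union graph, this sketch does not remove duplicates that appear in more than one part.

```java
import java.util.AbstractCollection;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/**
 * Hypothetical read-only union view (NOT a Clerezza class).
 * It keeps references to the underlying collections and never copies
 * their elements, so changes to a part are visible through the view.
 */
public class UnionView<T> extends AbstractCollection<T> {

    private final List<Collection<T>> parts;

    @SafeVarargs
    public UnionView(Collection<T>... parts) {
        // Only the references to the parts are stored, not their contents.
        this.parts = Arrays.asList(parts);
    }

    @Override
    public Iterator<T> iterator() {
        final Iterator<Collection<T>> outer = parts.iterator();
        return new Iterator<T>() {
            private Iterator<T> inner = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                // Advance to the next non-empty part on demand.
                while (!inner.hasNext() && outer.hasNext()) {
                    inner = outer.next().iterator();
                }
                return inner.hasNext();
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return inner.next();
            }
        };
    }

    @Override
    public int size() {
        // Sum of the part sizes; duplicates across parts are counted twice.
        int n = 0;
        for (Collection<T> part : parts) {
            n += part.size();
        }
        return n;
    }
}
```

With two lists a and b, new UnionView<>(a, b) iterates over the elements of a followed by those of b, and an element added to b afterwards shows up in the view without any copying.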

>
> The performance of a sparql query may depend on the order of the where
> clauses. For queries against a set of graphs from the same storage endpoint
> we will soon implement a transparent fastlane to the sparql-endpoint of the
> storage provider (if the storage provider supports sparql), but in your case
> as you are querying against a UnionGraph (and not against named actually
> stored graphs) this will not make any difference.

Ok, cool, thanks for the info!
-Bertrand
