incubator-jena-dev mailing list archives

From "Rob Vesse (Created) (JIRA)" <>
Subject [jira] [Created] (JENA-178) SPARQL Results serialization and parsing is slow with large result sets
Date Tue, 13 Dec 2011 23:13:30 GMT
SPARQL Results serialization and parsing is slow with large result sets

                 Key: JENA-178
             Project: Jena
          Issue Type: Bug
          Components: ARQ
    Affects Versions: ARQ 2.8.9
         Environment: Windows 7 Enterprise 64 bit
            Reporter: Rob Vesse

Serializing and parsing the SPARQL XML and JSON result formats is very slow when the result
set is large.  This is surprising to me since both formats are relatively simple and should
lend themselves to fairly fast streaming serialization and parsing.
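
To illustrate what "streaming" parsing means here, the following is a minimal sketch that reads a SPARQL XML result set one row at a time. It uses Python's standard library purely for illustration and says nothing about how ARQ is implemented; the function name iter_bindings and the sample document are made up for this example.

```python
import io
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SPARQL Query Results XML Format.
SR = "{http://www.w3.org/2005/sparql-results#}"


def iter_bindings(stream):
    """Yield one {variable: value} dict per <result> as it is parsed."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == SR + "result":
            row = {}
            for binding in elem.findall(SR + "binding"):
                term = binding[0]  # the <uri>, <literal> or <bnode> child
                row[binding.get("name")] = term.text or ""
            yield row
            elem.clear()  # drop this row's children once it has been consumed


# Hypothetical one-row result set, used only to exercise the parser.
SAMPLE = b"""<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="s"/><variable name="p"/><variable name="o"/></head>
  <results>
    <result>
      <binding name="s"><uri>http://example.org/a</uri></binding>
      <binding name="p"><uri>http://example.org/p</uri></binding>
      <binding name="o"><literal>hello</literal></binding>
    </result>
  </results>
</sparql>"""

rows = list(iter_bindings(io.BytesIO(SAMPLE)))
```

Each row is handed to the caller as soon as its closing tag is seen, so the work per row should stay roughly constant regardless of how many rows follow.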

The following are observed performance figures comparing the SPARQL XML, SPARQL JSON and
SPARQL TSV result formats.  Each figure is the time, averaged over 5 runs, to retrieve the
first 50,000 triples from the dataset with a simple SELECT * WHERE { ?s ?p ?o } LIMIT 50000
via an HTTP request to Fuseki and iterate over the results on the client.
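
For anyone wanting to repeat a measurement of this kind, here is a rough sketch of a timing harness. It is not the client used for the figures in this report: it uses Python's standard library, the endpoint URL in the usage note is a hypothetical local Fuseki dataset, and fetch_and_iterate only reads the response body line by line, so it measures server-side serialization and transfer rather than a client library's parser.

```python
import time
import urllib.parse
import urllib.request

QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 50000"

# Media types for the three result formats being compared.
ACCEPT = {
    "SPARQL XML": "application/sparql-results+xml",
    "SPARQL JSON": "application/sparql-results+json",
    "SPARQL TSV": "text/tab-separated-values",
}


def build_request(endpoint, query, accept):
    """Build a GET request: query in the URL, result format chosen via Accept."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(url, headers={"Accept": accept})


def fetch_and_iterate(request):
    """Send the request and consume the body line by line (no format-specific parsing)."""
    with urllib.request.urlopen(request) as response:
        for _line in response:
            pass


def average_seconds(fetch, runs=5):
    """Return the mean wall-clock time of calling fetch() `runs` times."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        fetch()
        total += time.perf_counter() - start
    return total / runs
```

Assuming a dataset is exposed at http://localhost:3030/ds/query (an assumed name, adjust to your setup), a single format could be timed with average_seconds(lambda: fetch_and_iterate(build_request("http://localhost:3030/ds/query", QUERY, ACCEPT["SPARQL XML"]))).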

SPARQL XML = 15.25 seconds
SPARQL JSON = 10.9 seconds
SPARQL TSV = 0.54 seconds

Now obviously TSV is much simpler to serialize and parse than XML/JSON, but these serializers
and parsers should not be 20-30 times slower, IMO.
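
The "20-30 times" range follows directly from the three averages reported above, as this small check shows (the variable names are just for illustration):

```python
# Averaged timings in seconds, as reported in this issue.
timings = {"SPARQL XML": 15.25, "SPARQL JSON": 10.9, "SPARQL TSV": 0.54}

baseline = timings["SPARQL TSV"]

# Slowdown of each format relative to TSV, rounded to one decimal place.
ratios = {fmt: round(seconds / baseline, 1) for fmt, seconds in timings.items()}

for fmt, ratio in ratios.items():
    print(f"{fmt}: {ratio}x the TSV time")
```

That works out to roughly 28x for SPARQL XML and 20x for SPARQL JSON.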

Also, for comparison, note that doing an equivalent
CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o } LIMIT 50000
takes only about 2 seconds, and that is using RDF/XML serialization, which I would have
expected to be slower because RDF/XML is more complex to generate than either SPARQL XML or
SPARQL JSON results.  I haven't dug into the code in detail yet to investigate why this is
slow, but does the Jena team have any thoughts on this?

This message is automatically generated by JIRA.