incubator-clerezza-dev mailing list archives

From Daniel Spicar <>
Subject Re: Weak Performance of "application/json+rdf" serializer on big TripleCollections (CLEREZZA-643)
Date Wed, 26 Oct 2011 13:37:47 GMT
the JIRA issue can be found here:

On Wed, Oct 26, 2011 at 3:36 PM, Daniel Spicar <> wrote:

> Rupert provided a patch to improve serialization performance (thanks for
> the effort!). I reviewed his patch and have written my comments on the JIRA
> page. But I think we need to discuss the issues I raise there. In summary:
> - neither the patch nor the current implementation works reliably with very
> large graphs (larger than memory)
> - the patch is significantly faster than the current implementation
> - the current implementation is easier to quick-fix for very large graphs
> (but also very slow)
> There is a sketch of a better solution that should allow us to be faster
> and not limited by memory size. It is based on sorted iterators. However
> these iterators need to be supplied by the underlying TripleCollections and
> that will require more changes to the core of Clerezza.
> Because both the current implementation and the patch do not really work
> on "big" TripleCollections (when big means really, really big), the question we
> should discuss is:
> a) keep everything as it is and solve the problem properly (possibly as
> described in the issue)
> b) quick-fix the current implementation (slow performance) + schedule a
> proper solution
> c) apply the patch (fast but graphs limited to available memory size) +
> schedule a proper solution
> My favorite is c.
> What do you think?
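
To make the sorted-iterator idea from the quoted message a bit more concrete, here is a minimal sketch, not the patch and not Clerezza code. It assumes a hypothetical iterator that delivers triples sorted by subject and then by predicate (something the TripleCollections would have to provide). The Triple class is a simplified, string-only stand-in for illustration, and the output only mimics the subject/predicate/object nesting of the JSON format with plain strings as objects. The point is that the writer keeps only the current subject and predicate in memory, so graph size is not bounded by heap size.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.Iterator;

final class SortedStreamingSketch {

    // Hypothetical, simplified triple: NOT Clerezza's Triple interface.
    static final class Triple {
        final String subject, predicate, object;
        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    // Writes {"s":{"p":["o",...],...},...} in a single pass.
    // Assumption: 'sorted' is ordered by subject, then by predicate.
    static void write(Iterator<Triple> sorted, Writer out) throws IOException {
        String currentSubject = null;
        String currentPredicate = null;
        out.write("{");
        while (sorted.hasNext()) {
            Triple t = sorted.next();
            if (!t.subject.equals(currentSubject)) {
                // New subject: close the previous subject's last array and object.
                if (currentSubject != null) out.write("]},");
                out.write(quote(t.subject) + ":{");
                currentSubject = t.subject;
                currentPredicate = null;
            }
            if (!t.predicate.equals(currentPredicate)) {
                // New predicate within the same subject: close the previous array.
                if (currentPredicate != null) out.write("],");
                out.write(quote(t.predicate) + ":[");
                currentPredicate = t.predicate;
            } else {
                out.write(","); // another object for the same subject and predicate
            }
            out.write(quote(t.object));
        }
        if (currentSubject != null) out.write("]}");
        out.write("}");
    }

    // Minimal JSON string quoting (backslash and double quote only).
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) throws IOException {
        StringWriter w = new StringWriter();
        write(Arrays.asList(
                new Triple("s1", "p1", "o1"),
                new Triple("s1", "p1", "o2"),
                new Triple("s1", "p2", "o3"),
                new Triple("s2", "p1", "o4")).iterator(), w);
        System.out.println(w);
    }
}
```

With unsorted input this approach would emit the same subject key more than once, which is why the ordering has to be guaranteed by whatever supplies the iterator.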
