incubator-clerezza-dev mailing list archives

From Andy Seaborne <>
Subject Re: leak but where after parsing rdf files?
Date Tue, 25 Jan 2011 10:14:22 GMT

On 25/01/11 09:33, Hasan Hasan wrote:
> Hi Andy,
> thanks for taking a look at the code.
> This means that there is a limit to the number of triples with large
> literals that can be returned by jenaGraph.find(). Right?

Not in the design - Graph.find() returns a streaming iterator from TDB. 
If the application keeps the triples returned, then they take space: 
RDF terms are materialized to return them - there is no delayed 
evaluation there.

But once the iterator from Graph.find has returned a triple, it's not 
in TDB at all.  There is an issue with how the node table cache might 
grow because of large literals in it, but it is limited to a maximum 
number of entries.  Turn the cache size down.
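To illustrate the distinction, here is a minimal plain-Java sketch (not the Jena/TDB API; the name lazyLiterals and the sizes are invented for illustration): a lazy iterator materializes one large value per next() call, the way TDB materializes RDF terms as triples are returned. Consumed one at a time, each value becomes garbage immediately; only retaining the results in the application makes the heap grow.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class StreamingSketch {
    // Hypothetical stand-in for a streaming Graph.find(): each next()
    // materializes a fresh large "literal" on demand.
    static Iterator<String> lazyLiterals(final int count, final int size) {
        return new Iterator<String>() {
            int produced = 0;
            public boolean hasNext() { return produced < count; }
            public String next() {
                produced++;
                char[] buf = new char[size];
                Arrays.fill(buf, 'x');
                return new String(buf);
            }
            public void remove() { throw new UnsupportedOperationException(); }
        };
    }

    public static void main(String[] args) {
        // Streaming consumption: use each value, then drop it.
        // Heap use stays flat no matter how many values are produced.
        Iterator<String> it = lazyLiterals(1000, 100_000);
        long total = 0;
        while (it.hasNext()) {
            total += it.next().length();
        }
        System.out.println(total);

        // Retaining every result is what costs memory: the space is
        // held by the application's list, not by the iterator.
        List<String> kept = new ArrayList<String>();
        Iterator<String> it2 = lazyLiterals(10, 100_000);
        while (it2.hasNext()) {
            kept.add(it2.next());
        }
        System.out.println(kept.size());
    }
}
```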

> If this limit is
> exceeded, it can lead to an OutOfMemoryError. And this limit
> depends on the max memory allocated for the heap and the size of the literals?

And the size of the cache.

> So to see whether there is a memory leak, I could try to loop over
> jenaGraph.find(), where no iteration should raise a heap memory
> exception.

If the heap is big enough for the cache.  The worst case is pretty big.
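The worst case can be sketched with a capped LRU cache (a plain-Java illustration, not TDB's actual node table implementation; the class name and sizes are invented): with the eviction policy capping entries, heap use is bounded by roughly maxEntries times the size of one cached literal, which is why lowering the cache size helps when literals are large.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache, in the spirit of TDB's node table cache:
// an LRU map that evicts the eldest entry once a fixed cap is exceeded.
public class BoundedNodeCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public BoundedNodeCache(int maxEntries) {
        super(16, 0.75f, true);  // access-order gives LRU behaviour
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;  // evict once over the cap
    }

    public static void main(String[] args) {
        // Worst-case heap for the cache is about
        // maxEntries * (size of one cached literal).
        BoundedNodeCache<Integer, String> cache =
                new BoundedNodeCache<Integer, String>(100);
        char[] buf = new char[10_000];
        Arrays.fill(buf, 'x');
        String bigLiteral = new String(buf);
        for (int i = 0; i < 10_000; i++) {
            cache.put(i, bigLiteral);
        }
        System.out.println(cache.size());  // never exceeds the cap
    }
}
```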

> I'll test it now and let you know.
> But we'll consider your suggestion to not have large literals in the
> triples, but their references.
> Cheers
> Hasan

Let me know how it goes,

