incubator-jena-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paolo Castagna <>
Subject Re: SPARQL endpoints and high CPU usage when a client goes away...
Date Fri, 04 Mar 2011 20:25:14 GMT
Andy Seaborne wrote:
> On 03/03/11 18:09, Paolo Castagna wrote:
>> Hi,
>> we have noticed a high CPU usage when a client send a SPARQL query
>> which returns a large result set and the client goes away when we
>> are streaming back the response.
>> I think I might have found the reason why the high CPU usage persist
>> even after the client went away.
>> We use ResultSetFormatter.outputAsXML(OutputStream outStream,
>> ResultSet qresults) which is then using IndentedWriter:
>> simple.http.MonitoredOutputStream.ensureOpen()
>> simple.http.MonitoredOutputStream.write(byte[], int, int)
>> simple.http.ResponseStream.write(byte[], int, int)
>> com.hp.hpl.jena.sparql.resultset.XMLOutputResultSet.printResource(Resource) 
>> com.hp.hpl.jena.sparql.resultset.XMLOutputResultSet.printBindingValue(RDFNode) 
>> com.hp.hpl.jena.sparql.resultset.XMLOutputResultSet.binding(String,
>> RDFNode)
>> com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply()
>> com.hp.hpl.jena.sparql.resultset.XMLOutput.format(OutputStream, 
>> ResultSet)
>> com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(OutputStream,
>> ResultSet, String)
>> com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(OutputStream,
>> ResultSet)
>> If an IOException is generated, for example, because the client which
>> we were streaming the response to went away, the exception is swallowed
>> by the write method in IndentedWriter:
>> private void write(char ch)
>> { try { out.write(ch) ; } catch (IOException ex) {} }
>> The problem does not depend on the particular web framework used.
> It's high CPU usage because, while the underlying stream exists, there 
> is I/O going on - remove the I/O and you get no brake on the output and 
> it's taking a lot of exceptions at a point when normal very little is 
> done.  A CPU loop.

Yep, indeed (by the way, we could have found this by inspection... but
we used YourKit ( to find what the busy CPU(s)
were doing. I have not checked, but things like PMD or FindBugs might
even raise warnings for similar things.

I am going to apply for a new YourKit license for the work I do with
Jena. If other Jena committers want one, please, let me know ASAP.

> It could throw an (ARQ type) exception.  i.e. a RuntimeException.  The 
> stack of checked IOExceptions is fine in theory but I find checked 
> exceptions don't really work out in practice for error handling.  OK, 
> and sensible, for alternative return modelling, but then the catcher is 
> the immediate caller.  For errors, the catcher might be many levels up 
> the stack and so every method merely adds "throws IOEXception" which 
> devalues the whole point of the declaration.


Thank you for sliding in the change we needed while you were working
on UTF-8 stuff:

svn diff -r8570:8571

      private void write(char ch)
-    { try { out.write(ch) ; } catch (IOException ex) {} }
+    { try { out.write(ch) ; } catch (IOException ex) { throw new AtlasException(ex) ; } }

> Related: IndentedWriter uses a PrintWriter to warp the OutputStream. 
> There is (for UTF-8 only) a hopefully optimized BufferingWriter in 
> Atlas.  It might be faster and in a way that shows up in system 
> performance.

Less synchronization... good, good. :-)

Not completely related, but recently I've noticed that when multiple
threads are concurrently hitting TDB directly or Fuseki via HTTP requests,
the threads serving those requests are for most of their time in a BLOCKED
state... however I have not yet found exactly the reason why this happens
and if it's normal or not. I am planning, once again, to use YourKit to
investigate that.

> Both changes made.

Thank you. I appreciate.

> If you could, could you try the SVN dev version?  It will throw a 
> runtime exception, wrapping the IOException.

We had no time to try this today, it has been a rather unusual and
interesting [1,2] day at Talis today. ;-)

But, we will put it under test on Monday and report back if we see
problems (I doubt that, I am pretty sure everything is fine with the


>> I think Fuseki is affected by the same problem, but I have not time to
>> replicate
>> it with Fuseki yet.
> It will be as it just uses ResultSetFormatter.

Two birds with one stone.


>> Am I missing something here?
> No - good to fix this.
>     Andy
>> Thank you,
>> Paolo

View raw message