incubator-jena-dev mailing list archives

From "Claude Warren (Commented) (JIRA)" <>
Subject [jira] [Commented] (JENA-203) support for Non Blocking Parsers
Date Mon, 16 Apr 2012 10:18:22 GMT


Claude Warren commented on JENA-203:

I was pondering this problem recently and was wondering about creating a new polling iterator
class that returns TRUE, FALSE or NULL for hasNext(), with NULL meaning no data yet.

The idea is that each endpoint would be a thread fronted by a polling iterator, and each polling
iterator would plug into a worker/pool/what-have-you. The pool would poll the endpoints until
it got a TRUE or FALSE. On TRUE it would return true for hasNext(), and next() would return the
result from the same endpoint. On FALSE it would remove the endpoint from the pool; after the
last endpoint is removed it returns FALSE. On NULL it would move on to the next endpoint in the
pool, cycling back to the start when it reached the end.
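
A rough sketch of that idea in Java might look like the following. None of these names exist
in Jena: PollingIterator, QueuePollingIterator and PollingIteratorPool are hypothetical, and
the queue-backed endpoint is only one assumed way for a producer thread to hand over results.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical tri-state iterator: TRUE = item ready, FALSE = exhausted, null = no data yet. */
interface PollingIterator<T> {
    Boolean hasNext();
    T next();
}

/** Hypothetical endpoint front: a producer thread calls offer() and finally close(). */
final class QueuePollingIterator<T> implements PollingIterator<T> {
    private final Queue<T> queue = new ConcurrentLinkedQueue<>();
    private volatile boolean closed;

    void offer(T item) { queue.add(item); }
    void close() { closed = true; }

    @Override
    public Boolean hasNext() {
        // Read the flag before the queue so an item offered just before close() is not missed.
        boolean wasClosed = closed;
        if (!queue.isEmpty()) return Boolean.TRUE;
        return wasClosed ? Boolean.FALSE : null;
    }

    @Override
    public T next() {
        T item = queue.poll();
        if (item == null) throw new NoSuchElementException();
        return item;
    }
}

/** Hypothetical pool: cycles over the endpoints until one reports TRUE or all report FALSE. */
final class PollingIteratorPool<T> implements Iterator<T> {
    private final Deque<PollingIterator<T>> endpoints = new ArrayDeque<>();
    private PollingIterator<T> ready; // endpoint that last reported TRUE, if any

    PollingIteratorPool(Collection<? extends PollingIterator<T>> sources) {
        endpoints.addAll(sources);
    }

    @Override
    public boolean hasNext() {
        if (ready != null) return true;
        while (!endpoints.isEmpty()) {
            PollingIterator<T> endpoint = endpoints.pollFirst();
            Boolean state = endpoint.hasNext();
            if (state == null) {
                endpoints.addLast(endpoint); // no data yet: move on to the next endpoint
                Thread.yield();              // simple busy-wait; real code would back off
            } else if (state) {
                endpoints.addLast(endpoint); // keep it in the cycle for later results
                ready = endpoint;
                return true;
            }
            // FALSE: the endpoint is exhausted and is not put back into the pool
        }
        return false; // the last endpoint has been removed
    }

    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        T item = ready.next();
        ready = null;
        return item;
    }
}
```

In this sketch an endpoint that reports TRUE goes to the back of the cycle, so a fast endpoint
cannot starve a slower one; that ordering is a choice made here, not something settled above.
Note that the pool's own hasNext() still blocks (by spinning) while every endpoint says NULL.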

This should allow results from slower endpoints to be intermixed with results from faster
endpoints and should increase the speed (decrease the time) to get all results.
> support for Non Blocking Parsers
> --------------------------------
>                 Key: JENA-203
>                 URL:
>             Project: Apache Jena
>          Issue Type: Improvement
>            Reporter: Henry Story
> In a Linked Data environment servers have to fetch data off the web. The speed at which
> such data is served can be very slow. So one wants to avoid using up one thread for each
> connection (1 thread = 0.5 to 1MB approximately). This is why Java NIO was developed and
> why servers such as are so popular, why http client libraries such as are more and more
> numerous, and why frameworks such as which support relatively actors (500 bytes per actor)
> are growing more visible.
> Unless I am mistaken the only way to parse some content is using methods that use an
> InputStream such as this:
>     val m = ModelFactory.createDefaultModel()
>     m.getReader(lang.jenaLang).read(m, in, base.toString)
> That read call blocks. Would it be possible to have an API which allows
> one to parse a document in chunks as they arrive from the input?


