incubator-jena-dev mailing list archives

From "Henry Story (Commented) (JIRA)" <>
Subject [jira] [Commented] (JENA-203) support for Non Blocking Parsers
Date Thu, 01 Mar 2012 09:43:57 GMT


Henry Story commented on JENA-203:

I am not sure what the best way is to change the Jena API for non-blocking parsers, nor whether
anything needs to be done (yet). Essentially, the way these parsers work is that one should be
able to parse chunks of data, get some partial results (a small set of triples) and feed them
to a Jena graph or store. Feeding triples to a Jena Graph, or popping statements into a store
one at a time, is not a problem. So the XML parser I did above shows that it can be done with
the Jena RDF/XML parsers, and the Turtle parser shows how one can do it with other frameworks
that use Jena: after all, the Turtle parser tests can add triples to Jena or Sesame graphs.
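
To make the chunk-feeding idea concrete, here is a minimal sketch of a parser that accepts
bytes as they arrive and hands each completed triple to a callback. It is an illustration
only: it is neither of the parsers mentioned above nor a Jena API. The names `Triple`,
`ChunkedNTriplesParser`, `feed` and `finish` are all hypothetical, and the line handling is a
deliberately naive, N-Triples-like split on whitespace rather than a conforming parser.

```scala
import java.io.ByteArrayOutputStream
import java.nio.charset.StandardCharsets.UTF_8

// Hypothetical types for illustration; none of these names come from Jena.
final case class Triple(subject: String, predicate: String, obj: String)

final class ChunkedNTriplesParser(sink: Triple => Unit) {
  // Bytes of the current, still incomplete line. Buffering bytes (not chars)
  // keeps multi-byte UTF-8 sequences intact when a chunk boundary splits them.
  private val pending = new ByteArrayOutputStream()

  /** Feed one chunk. Returns as soon as the triples completed by this chunk are emitted. */
  def feed(chunk: Array[Byte]): Unit =
    chunk.foreach { b =>
      if (b == '\n'.toByte) flushLine() else pending.write(b.toInt)
    }

  /** Signal end of input so that a last line without a trailing newline is parsed too. */
  def finish(): Unit = flushLine()

  private def flushLine(): Unit = {
    val line = new String(pending.toByteArray, UTF_8).trim
    pending.reset()
    if (line.nonEmpty && !line.startsWith("#")) parseLine(line).foreach(sink)
  }

  // Naive split into subject, predicate and "the rest"; malformed lines are dropped.
  private def parseLine(line: String): Option[Triple] =
    line.stripSuffix(".").trim.split("\\s+", 3) match {
      case Array(s, p, o) => Some(Triple(s, p, o))
      case _              => None
    }
}
```

The caller never hands over an InputStream: whatever delivers the bytes (an NIO selector, an
actor, a callback from an HTTP client) calls `feed` for each chunk and `finish` at the end, and
the `sink` can add each triple to a graph or store as it appears.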

But I think awareness of this problem should help guide your thinking when developing new
parsers, or when considering what is needed to work with linked data in an efficient way.

After doing this a few times, an API will probably emerge.

Currently I have a simple blocking interface API for the non-blocking parser

We all know this API. I need to find out how people in the actors community do this, and see
what kind of pattern they agree is good. If I find that, I'll post it here. Perhaps that will
lead to some ideas of what such a pattern looks like.
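
For readers who want a picture of what a blocking facade over a chunk-fed parser can look
like, here is a small sketch. It is an assumption about the general shape, not the code
referred to above: `BlockingFeed`, `pump` and the `feed` callback are hypothetical names, and
the callback stands in for whatever method the non-blocking parser exposes for accepting bytes.

```scala
import java.io.InputStream
import java.util.Arrays

// Hypothetical helper for illustration; not part of Jena or of the parsers discussed here.
object BlockingFeed {
  /** Reads `in` to the end, handing each chunk to `feed`. Blocks the calling thread. */
  def pump(in: InputStream, feed: Array[Byte] => Unit, bufferSize: Int = 4096): Unit = {
    val buffer = new Array[Byte](bufferSize)
    var n = in.read(buffer) // blocks until some bytes are available or the stream ends
    while (n != -1) {
      // Copy the filled part, because `buffer` is reused on the next read.
      if (n > 0) feed(Arrays.copyOf(buffer, n))
      n = in.read(buffer)
    }
  }
}
```

The loop is the only part that blocks. Replacing it with an event-driven source that calls
`feed` whenever bytes arrive leaves the parser itself unchanged.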

(The NTriples file moved. Here is the current snapshot link, which should be a permalink,
but won't necessarily be the most up-to-date one.)

I'll keep you posted on further developments. I should try using these parsers in a real scenario
soon, so I'll soon know how well this holds up.

> support for Non Blocking Parsers
> --------------------------------
>                 Key: JENA-203
>                 URL:
>             Project: Apache Jena
>          Issue Type: Improvement
>            Reporter: Henry Story
> In a Linked Data environment servers have to fetch data off the web. The speed at which
> such data is served can be very slow. So one wants to avoid using up one thread for each
> connection (1 thread = 0.5 to 1MB approximately). This is why Java NIO was developed and
> why servers such as are so popular, why http client libraries such as are more
> and more numerous, and why frameworks such as which support relatively
> actors (500 bytes per actor) are growing more visible.
> Unless I am mistaken the only way to parse some content is using methods that use an
> InputStream such as this:
>     val m = ModelFactory.createDefaultModel()
>     m.getReader(lang.jenaLang).read(m, in, base.toString)
> That read call blocks. Would it be possible to have an API which allows
> one to parse a document in chunks as they arrive from the input?


