incubator-jena-dev mailing list archives

From "Andy Seaborne (Commented) (JIRA)" <>
Subject [jira] [Commented] (JENA-127) Add RDF/JSON Parsing Support to RIOT
Date Sat, 08 Oct 2011 09:48:30 GMT


Andy Seaborne commented on JENA-127:

The RIOT output framework is less defined; "undefined" would be 
accurate.  For now, it is best to write it into Jena core as a Jena 
writer, then go via a static function like writeRDFJSON(graph), and 
it'll get adapted

Input efficiency is more important because it more directly affects 
users' experience with large data.

Rob - the tokenizer is doing minimal lookahead.  Bytes->chars conversion 
is done in large chunks (by the Java library - I tried to short circuit 
it with a non-codepoint-checking version but it was not faster).
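
As a rough illustration of chunked bytes->chars conversion (a minimal 
sketch, not the RIOT code): the standard InputStreamReader does the 
decoding, and the caller pulls characters out in large blocks. The class 
name, method name and the chunk size in the usage line are assumptions 
made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: names and sizes are illustrative, not taken from RIOT.
public class ChunkedDecodeSketch {

    // Decode UTF-8 bytes to chars in blocks of chunkSize and count the chars.
    // A tokenizer would scan each filled block instead of just counting.
    static long countChars(InputStream in, int chunkSize) throws IOException {
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
        char[] chunk = new char[chunkSize];
        long total = 0;
        int n;
        // read(char[], int, int) fills as much of the block as is available.
        while ((n = reader.read(chunk, 0, chunk.length)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = "<s> <p> \"héllo\" .".getBytes(StandardCharsets.UTF_8);
        // 128K chars per block is an assumed size for the example.
        System.out.println(countChars(new ByteArrayInputStream(bytes), 128 * 1024));
    }
}
```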

The tokens are quite simple - only one character of lookahead is needed 
except in the case of prefixed names [*].  There are no general regexps 
going on so it should be fast, and I've profiled it heavily for N-Triples 
and others - I don't see any obvious hot spots or inefficiencies.

[*] The end of a local name is (CHARS|'.')* CHARS so there is a little 
dance to handle ".".  It does not affect RDF/JSON, and it is at most 
one character of pushback, not general backtracking.
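
The "little dance" around "." can be sketched roughly as follows. This is 
a simplified illustration, not the RIOT tokenizer: CharSource, 
scanLocalName and isNameChar are hypothetical names, CHARS is 
approximated as letters, digits, '_' and '-', and a '.' is kept only when 
a name character follows it directly (runs of dots are not handled).

```java
// Hypothetical sketch of a one-character-pushback scan; not the RIOT code.
public class LocalNameSketch {

    // Minimal character source with one-character peek and one pushback slot.
    static final class CharSource {
        private final String text;
        private int pos = 0;
        private int pushed = -1;   // -1 means the pushback slot is empty

        CharSource(String text) { this.text = text; }

        int read() {
            if (pushed != -1) { int c = pushed; pushed = -1; return c; }
            return pos < text.length() ? text.charAt(pos++) : -1;
        }

        int peek() {
            if (pushed != -1) return pushed;
            return pos < text.length() ? text.charAt(pos) : -1;
        }

        void pushback(int ch) {
            if (pushed != -1) throw new IllegalStateException("only one char of pushback");
            pushed = ch;
        }
    }

    // Assumed approximation of CHARS for this example only.
    static boolean isNameChar(int ch) {
        return ch != -1 && (Character.isLetterOrDigit(ch) || ch == '_' || ch == '-');
    }

    // Scan a local name; a trailing '.' is pushed back, not included.
    static String scanLocalName(CharSource src) {
        StringBuilder sb = new StringBuilder();
        while (true) {
            int ch = src.peek();
            if (isNameChar(ch)) {
                sb.append((char) src.read());
            } else if (ch == '.') {
                src.read();                      // tentatively consume the dot
                if (isNameChar(src.peek())) {
                    sb.append('.');              // dot is inside the name
                } else {
                    src.pushback('.');           // dot ends the name: give it back
                    break;
                }
            } else {
                break;                           // any other char ends the name
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        CharSource src = new CharSource("ex.name. ");
        System.out.println(scanLocalName(src));  // the name without the final dot
        System.out.println((char) src.read());   // the dot is still available
    }
}
```

The scan never re-reads more than the single dot it pushed back, which is 
the sense in which this is pushback rather than general backtracking.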

> Add RDF/JSON Parsing Support to RIOT
> ------------------------------------
>                 Key: JENA-127
>                 URL:
>             Project: Jena
>          Issue Type: New Feature
>          Components: ARQ, Jena, RIOT
>         Environment: All
>            Reporter: Rob Vesse
>            Assignee: Paolo Castagna
>            Priority: Minor
>              Labels: patch, rdf/json, riot
>         Attachments: ARQ-RDF-JSON-tests_r1179639.patch, ARQ_JENA-127_r1179358.patch,,, RdfJsonRiotPatch-ApacheSVN.patch, RdfJsonRiotPatch.patch,
> The attached patch provides an RDF/JSON (Talis Specification) parser for
> RIOT; the patch is against ARQ trunk from the Jena SourceForge SVN
> repository.
> It plugs in as an implementation of LangRIOT (named LangRDFJSON) and uses
> the existing TokenizerJSON from the atlas package to do the tokenisation.
> There is also a JenaReaderRdfJson added as part of this patch which does
> what the name suggests.
> I have also included in this patch a set of unit tests which verify the
> parser's behaviour with a variety of valid and invalid inputs.
> There are still some things to be addressed:
> - The patch includes registration of the Jena reader when
>   SysRiot.writeIntoJena() is called but does not unregister itself when
>   resetJenaReaders() is called; should this be done?
> - Add an RDF/JSON writer - a separate patch will be submitted at a later
>   date (likely next week) for this
> Otherwise the patch is fairly comprehensive and I hope it can be reviewed
> and included in future releases.
> EDIT - I have now redone the patch against Apache SVN as well and attached
> that as a separate file, since there are some differences in the structure
> of the two repos and some minor code changes that mean the SourceForge SVN
> patch cannot be applied directly against Apache SVN.
