incubator-general mailing list archives

From Andy Seaborne <andy.seabo...@epimorphics.com>
Subject Re: Clerezza, Stanbol, Jena, Semantic Commons, WDYT?
Date Tue, 09 Nov 2010 16:52:58 GMT


On 09/11/10 16:22, Florent Guillaume wrote:
> On Tue, Nov 9, 2010 at 1:19 PM, Andy Seaborne
> <andy.seaborne@epimorphics.com>  wrote:
>> Jeremy identified the IRI library as a potential contribution to a commons
>> area.  It is free-standing, and does not use or call any Jena RDF code - it
>> depends only on ICU4J (and JUnit + Jflex in the build).
>
> Please note that Abdera already has an IRI library.
> http://svn.apache.org/repos/asf/abdera/java/trunk/dependencies/i18n/src/main/java/org/apache/abdera/i18n/iri/

Florent,

Thanks for pointing that out.  I see it has a test suite as well, and it 
would be good to check our library against it to make sure we've got 
things right.

Illegal IRIs have been a bit of a plague in RDF data, and the IRI 
library (written by Jeremy) is a response to that.  It checks both the 
rules for specific IRI schemes and the recommended forms, because IRIs 
are often compared for equality.  The library is quite picky.  It 
includes profiles for RDF URI references, for IRIs, and for the 
compromise we use in Jena as a balance of legacy and spec exactness.

There is an online test service for RDF data in non-RDF/XML formats at:

http://sparql.org/data-validator.html

The IRIs are checked with the IRI library.

	Andy

A few examples:

http://example/a b

Code: 17/WHITESPACE in PATH: A single whitespace character. These match 
no grammar rules of URIs/IRIs. These characters are permitted in RDF URI 
References, XML system identifiers, and XML Schema anyURIs.

http://example/a[]b

Code: 0/ILLEGAL_CHARACTER in PATH: The character violates the grammar 
rules for URIs/IRIs.

http://example:80/

Code: 13/DEFAULT_PORT_SHOULD_BE_OMITTED in PORT: If the port is the 
default one for the scheme it should be omitted.

Code: 14/PORT_SHOULD_NOT_BE_WELL_KNOWN in PORT: Ports under 1024 should 
be accessed using the appropriate scheme name.

urn:xyz

Code: 61/SCHEME_PATTERN_MATCH_FAILED in PATH: The scheme specific syntax 
rules are violated.
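
To make the kinds of check in these examples concrete, here is a rough 
sketch in plain Java.  It is not the IRI library and does not use its 
API: the class name IriCheckSketch, the method check, the regular 
expressions, the illegal-character set and the default-port table are 
all assumptions made for illustration, and they cover far fewer cases 
than the real library does.  Only the violation names are taken from 
the messages above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; NOT the Jena IRI library or its API.
class IriCheckSketch {
    // Assumed simplification: scheme://host[:port]rest
    private static final Pattern HIER = Pattern.compile(
        "^([A-Za-z][A-Za-z0-9+.-]*)://([^/:?#]*)(?::(\\d{1,5}))?(.*)$",
        Pattern.DOTALL);
    // Assumed simplification of URN syntax: urn:NID:NSS
    private static final Pattern URN = Pattern.compile(
        "^urn:[A-Za-z0-9][A-Za-z0-9-]{0,31}:.+$",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    // Assumed, incomplete set of characters not allowed unescaped in a path.
    private static final String ILLEGAL = "[]<>\"{}|\\^`";
    // Assumed, incomplete table of default ports.
    private static final Map<String, Integer> DEFAULT_PORTS =
        Map.of("http", 80, "https", 443);

    static List<String> check(String iri) {
        List<String> violations = new ArrayList<>();
        if (iri.chars().anyMatch(Character::isWhitespace)) {
            violations.add("WHITESPACE");
        }
        Matcher m = HIER.matcher(iri);
        if (m.matches()) {
            String scheme = m.group(1).toLowerCase();
            String port = m.group(3);   // null when no port is given
            String rest = m.group(4);   // path, query and fragment
            if (rest.chars().anyMatch(c -> ILLEGAL.indexOf(c) >= 0)) {
                violations.add("ILLEGAL_CHARACTER");
            }
            if (port != null) {
                int p = Integer.parseInt(port);
                Integer dflt = DEFAULT_PORTS.get(scheme);
                if (dflt != null && dflt == p) {
                    violations.add("DEFAULT_PORT_SHOULD_BE_OMITTED");
                }
                if (p < 1024) {
                    violations.add("PORT_SHOULD_NOT_BE_WELL_KNOWN");
                }
            }
        } else if (iri.toLowerCase().startsWith("urn:")
                   && !URN.matcher(iri).matches()) {
            violations.add("SCHEME_PATTERN_MATCH_FAILED");
        }
        return violations;
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://example/a b", "http://example/a[]b",
            "http://example:80/", "urn:xyz"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + check(s));
        }
    }
}
```

The sketch returns a list of names rather than stopping at the first 
problem, since a single IRI can trigger several violations at once, as 
http://example:80/ does above.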
