incubator-jena-dev mailing list archives

From Andy Seaborne <>
Subject Fwd: RDF WG Resolution Regarding Various Forms of String Literals
Date Thu, 16 Jun 2011 09:14:14 GMT
> -------- Original Message --------
> Subject: RDF WG Resolution Regarding Various Forms of String Literals
> Resent-Date: Wed, 15 Jun 2011 16:40:08 +0000
> Resent-From:
> Date: Wed, 15 Jun 2011 12:39:19 -0400
> From: David Wood <>
> To: Manu Sporny <>, Ian Horrocks <>,
>     Lee Feigenbaum <>
> CC: RDF Working Group WG <>, W3C SW CG Group <>,
> Hi all,
> The RDF working group resolved our ISSUE-12 [1] today, which is
> intended to "reconcile various forms of string literals".
> We resolved to accept the proposal at:
> with the modification that preferred output form (SHOULD) is "foo"
> not "foo"^^xsd:string in RDF; and we recommend that SPARQL and other
> WGs do the same.
> Discussion highlighted several possible areas of concern, which we
> believe the current proposal addresses.  Specifically, it was noted
> that:
> - The forms "foo" and "foo"^^xsd:string are equivalent input
>   syntaxes.
> - The form "foo" is the preferred output syntax.
> - The WG suggests retaining the term "plain literal" in documents to
>   avoid unnecessary rework.  Such plain literals would be considered
>   semantically equivalent to xsd:strings.
> NB: This resolution makes *no statement* about language-tagged
> literals (e.g. "foo"@en).
> We invite discussion regarding the ramifications of this resolution
> to other working groups and implementors.
> Regards, Dave
> [1]

This resolution by the RDF-WG is an attempt to reconcile plain literals
without a language tag (simple literals in SPARQL terminology) with
xsd:strings.

This resolution is subject to it being acceptable to other groups.  Touching
datatypes gets the OWL world concerned.

Only xsd:strings would exist in the abstract syntax.

:x :p "foo" .
:x :p "foo"^^xsd:string .

is a graph of one triple.

The output should be:
:x :p "foo" .
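
The one-triple behaviour above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not Jena code: the
names Literal, parse_literal and write_literal are hypothetical, language
tags are left out (the resolution says nothing about them), and string
escaping is ignored.

```python
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class Literal:
    """Hypothetical literal term: every literal carries a datatype."""
    lexical: str
    datatype: str = XSD_STRING


def parse_literal(lexical: str, datatype: Optional[str] = None) -> Literal:
    # Both input syntaxes, "foo" and "foo"^^xsd:string, map to the same term.
    return Literal(lexical, datatype or XSD_STRING)


def write_literal(lit: Literal) -> str:
    # Preferred output form: omit the datatype when it is xsd:string.
    # (No escaping of the lexical form; this is a sketch only.)
    if lit.datatype == XSD_STRING:
        return f'"{lit.lexical}"'
    return f'"{lit.lexical}"^^<{lit.datatype}>'


# A graph is a set of triples, so the two input forms collapse into one.
graph = {
    (":x", ":p", parse_literal("foo")),
    (":x", ":p", parse_literal("foo", XSD_STRING)),
}
lines = [f"{s} {p} {write_literal(o)} ." for s, p, o in graph]
print(len(graph), lines)
```

The set ends up holding a single triple, and the writer emits the short
form for it.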

This is obviously a visible change but, in my experience, unlikely to be
much of a problem because data either has xsd:strings or simple
literals, rarely a mixture and if it is a mixture, not on the same

Jena memory models already equate simple literals and xsd:strings,
but can return 2 results for a match on "foo".
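
A toy illustration of how that can happen: a store that keeps both terms
as distinct entries but compares literals by value when matching. This is
an assumption about the general shape of the behaviour, not a description
of Jena's implementation; same_value and find are hypothetical names.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


def same_value(a, b):
    """Value comparison: a missing datatype is treated as xsd:string."""
    return a[0] == b[0] and (a[1] or XSD_STRING) == (b[1] or XSD_STRING)


# Literals are (lexical form, datatype or None). Both spellings are stored
# exactly as they were given, so the store holds two entries.
triples = [
    (":x", ":p", ("foo", None)),
    (":x", ":p", ("foo", XSD_STRING)),
]


def find(obj):
    # Match on the object position using value equality, not term equality.
    return [t for t in triples if same_value(t[2], obj)]


matches = find(("foo", None))
print(len(matches))
```

Because the terms were never merged on the way in, a match on "foo"
finds both stored triples.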

Persistent models see these as different and unrelated.

There are various ways to implement this.  There is no rush - the first
step is to see if the resolution actually sticks and is accepted by other
groups, then by the community at large.

Then either or both of:

1/ Change Node_Literal/Node(Factory). This could even turn
xsd:strings into plain literals (the reverse of the resolution),
minimising the effect on writers.

2/ Get the parsers and writers to sort it out and leave the graph core

I've already been considering a canonicalization phase (e.g. numbers) 
after parsing, before inserting into a model.  This could be added to 
that pipeline.
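
A minimal sketch of such a canonicalization phase, sitting between the
parser output and the model. Everything here is hypothetical (the tuple
representation, canonical_literal, canonicalize); it normalizes
xsd:string to the plain form and xsd:integer to its canonical lexical
form, and passes everything else through untouched.

```python
import re
from typing import Iterable, Iterator, Optional, Tuple

XSD = "http://www.w3.org/2001/XMLSchema#"
Literal = Tuple[str, Optional[str]]  # (lexical form, datatype IRI or None)
Triple = Tuple[str, str, Literal]

INTEGER = re.compile(r"[+-]?[0-9]+")


def canonical_literal(lit: Literal) -> Literal:
    lexical, datatype = lit
    if datatype == XSD + "string":
        # Store the preferred form: plain "foo" rather than "foo"^^xsd:string.
        return (lexical, None)
    if datatype == XSD + "integer" and INTEGER.fullmatch(lexical):
        # "+007" and "7" denote the same value; keep one spelling.
        return (str(int(lexical)), datatype)
    # Unknown datatypes and ill-formed lexical forms pass through unchanged.
    return lit


def canonicalize(triples: Iterable[Triple]) -> Iterator[Triple]:
    """Pipeline stage: runs after parsing, before inserting into a model."""
    for s, p, o in triples:
        yield (s, p, canonical_literal(o))


parsed = [
    (":x", ":p", ("foo", None)),
    (":x", ":p", ("foo", XSD + "string")),
    (":x", ":q", ("+007", XSD + "integer")),
    (":x", ":q", ("7", XSD + "integer")),
]
model = set(canonicalize(parsed))
print(len(model))
```

Here the plain form is stored, which matches the direction in option 1;
storing xsd:string instead would only change the first branch.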


Plain literals with language tag are a whole different matter.
One possible outcome is that we have a class (not a datatype) of
rdf:LangTagString or some other sensible name.

