commons-issues mailing list archives

From "Stian Soiland-Reyes (JIRA)" <>
Subject [jira] [Commented] (COMMONSRDF-51) RDF-1.1 specifies that language tags need to be compared using lower-case
Date Thu, 12 Jan 2017 11:45:58 GMT


Stian Soiland-Reyes commented on COMMONSRDF-51:

Got one reply already on public-rdf-comments, from Richard Cyganiak:

RDF 2004 forced the language tag to be lower-cased in the abstract syntax. Implementations
of RDF 2004 often did not do that, but retained the case when storing or transforming RDF,
while still treating @en and @EN as equal. My recollection is that we wanted to change the
language of the spec to make this behaviour legal. Unfortunately it seems the language came
out less clear than it should be. I do not think that there was any intention to make @en
and @EN not equal.

> RDF-1.1 specifies that language tags need to be compared using lower-case
> -------------------------------------------------------------------------
>                 Key: COMMONSRDF-51
>                 URL:
>             Project: Apache Commons RDF
>          Issue Type: Bug
>          Components: api
>    Affects Versions: 0.3.0
>            Reporter: Peter Ansell
>            Assignee: Stian Soiland-Reyes
> The RDF-1.1 specification states that the value space of Literal language tags is lowercase,
> which does not conflict with the case-insensitive specification in BCP47. The Literal.equals
> and Literal.hashCode API contracts should specify that language tags must be compared using
> lowercase, even if they are otherwise stored and returned as upper-case by getLanguageTag.
> The API currently has incorrect language by saying "character-by-character" for language tag
> comparisons, as that implies case-sensitive comparisons are used.
> The lowercasing must also be done using a consistent locale (a known example where
> lowercase and uppercase do not round-trip as expected for US-ASCII characters is Turkish [1]),
> so I would recommend actually stating that .toLowerCase(Locale.ENGLISH) is used.
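
To illustrate the comparison rule proposed in the issue, here is a minimal sketch, not the Commons RDF implementation. The class LanguageTags and its methods normalize, sameLanguageTag and languageTagHash are hypothetical names made up for this example; only String.toLowerCase(Locale) and Locale.forLanguageTag are standard JDK calls. The sketch assumes that equality and hashing are both derived from the tag lower-cased with Locale.ENGLISH, while the stored tag keeps its original case.

```java
import java.util.Locale;

// Hypothetical helper, not part of the Commons RDF API.
final class LanguageTags {

    // Lower-case with a fixed locale so the result does not depend on
    // the JVM's default locale.
    static String normalize(String languageTag) {
        return languageTag.toLowerCase(Locale.ENGLISH);
    }

    // Case-insensitive comparison of two language tags.
    static boolean sameLanguageTag(String a, String b) {
        return normalize(a).equals(normalize(b));
    }

    // Hash derived from the normalized tag, so tags that compare equal
    // also hash equally.
    static int languageTagHash(String languageTag) {
        return normalize(languageTag).hashCode();
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");

        // Under Turkish rules, 'I' lower-cases to dotless 'ı' (U+0131),
        // so a locale-sensitive lower-casing of "FI" does not give "fi".
        System.out.println("FI".toLowerCase(turkish).equals("fi")); // false

        // With a fixed locale the comparison behaves as expected.
        System.out.println(sameLanguageTag("FI", "fi"));       // true
        System.out.println(sameLanguageTag("en-GB", "EN-gb")); // true
    }
}
```

Deriving the hash from the same normalized form as the equality check is what keeps the equals and hashCode contracts consistent with each other; getLanguageTag could still return the tag as originally supplied.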

This message was sent by Atlassian JIRA
