commons-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (COMMONSRDF-51) RDF-1.1 specifies that language tags need to be compared using lower-case
Date Mon, 16 Jan 2017 17:49:26 GMT


ASF GitHub Bot commented on COMMONSRDF-51:

GitHub user stain commented on the issue:
    This pull request returns `getLanguageTag()` in whatever case the underlying platform
uses (e.g. I think RDF4J and JSONLD-Java preserve casing, while Jena and Simple convert
to lowercase).
    I think it is only in `.equals()` and `.hashCode()` that we need case insensitivity.
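
    A minimal sketch of that split, assuming a hypothetical `SketchLiteral` class (it is not part of the Commons RDF API, and it ignores datatypes for brevity): `getLanguageTag()` returns the tag as stored, while `equals()` and `hashCode()` work on a lowercased copy. The use of `Locale.ENGLISH` follows the recommendation in the issue description.

```java
import java.util.Locale;
import java.util.Objects;
import java.util.Optional;

// Hypothetical class for illustration only; not a Commons RDF type.
final class SketchLiteral {
    private final String lexicalForm;
    private final String languageTag; // kept in its original casing; may be null

    SketchLiteral(String lexicalForm, String languageTag) {
        this.lexicalForm = Objects.requireNonNull(lexicalForm);
        this.languageTag = languageTag;
    }

    // Presentation: return the tag exactly as it was supplied.
    Optional<String> getLanguageTag() {
        return Optional.ofNullable(languageTag);
    }

    // Comparison form: lowercased with a fixed locale, never exposed to callers.
    private String normalizedTag() {
        return languageTag == null ? null : languageTag.toLowerCase(Locale.ENGLISH);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof SketchLiteral)) {
            return false;
        }
        SketchLiteral other = (SketchLiteral) obj;
        return lexicalForm.equals(other.lexicalForm)
                && Objects.equals(normalizedTag(), other.normalizedTag());
    }

    @Override
    public int hashCode() {
        // Must hash the same normalized tag that equals() compares.
        return Objects.hash(lexicalForm, normalizedTag());
    }
}
```

    With this sketch, `new SketchLiteral("hello", "en-US")` and `new SketchLiteral("hello", "en-us")` are equal and share a hash code, yet each still reports its own casing from `getLanguageTag()`.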
    There are arguments both ways: should we provide a consistent view across the implementations
(e.g. always lowercase), or should we stay consistent with what the underlying implementation
does (e.g. if it preserves casing for presentation purposes)?
    Commons RDF doesn't currently have any value handling mechanisms for, say, converting `"13.37"^^xsd:float`
to a Java float `13.37f` (without going through the underlying implementations and related
methods), or for determining value equality, so I think it is not too weird if Commons RDF doesn't
do anything clever about language tags either (beyond spec compliance).
    But if someone were to add a Commons RDF API for such literal value handling, it could
be natural to also add "utils" methods for presenting or parsing language tags (e.g. `isLanguageTagEqual("en-us",
"en-US")` as well as hierarchical comparisons, something like `isSameLanguageTagFamily("en-us",

> RDF-1.1 specifies that language tags need to be compared using lower-case
> -------------------------------------------------------------------------
>                 Key: COMMONSRDF-51
>                 URL:
>             Project: Apache Commons RDF
>          Issue Type: Bug
>          Components: api
>    Affects Versions: 0.3.0
>            Reporter: Peter Ansell
>            Assignee: Stian Soiland-Reyes
> The RDF-1.1 specification states that the value space of Literal language tags is lowercase,
which does not conflict with the case-insensitive specification in BCP47. The Literal.equals
and Literal.hashCode API contracts should specify that language tags must be compared using
lowercase, even if they are otherwise stored and returned as upper-case by getLanguageTag.
The API currently has incorrect language by saying "character-by-character" for language tag
comparisons, as that implies case-sensitive comparisons are used.
> The lowercasing must also be done using a consistent locale (a known example where
lowercase and uppercase do not round-trip as expected for US-ASCII characters is Turkish [1]),
so I would recommend actually stating that .toLowerCase(Locale.ENGLISH) is used.
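
The Turkish case can be reproduced with a short sketch. The class and method names (`LanguageTagCase`, `lower`) are hypothetical and only illustrate the recommendation; `String.toLowerCase(Locale)` and `Locale.forLanguageTag` are standard JDK calls. Under a Turkish locale, ASCII `I` lowercases to dotless `ı` (U+0131), so a tag such as `IT` no longer matches `it`.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of the Commons RDF API.
final class LanguageTagCase {

    // Lowercase a language tag with a fixed locale, as the issue recommends.
    static String lower(String tag) {
        return tag.toLowerCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        String tag = "IT";

        // Turkish rules map 'I' to dotless 'ı' (U+0131), so this prints false.
        System.out.println(tag.toLowerCase(turkish).equals("it"));

        // A fixed English locale maps 'I' to 'i', so this prints true.
        System.out.println(lower(tag).equals("it"));
    }
}
```

The same hazard applies to the no-argument `toLowerCase()`, which uses the JVM's default locale, so passing an explicit locale keeps comparisons independent of where the code runs.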

