commons-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (COMMONSRDF-51) RDF-1.1 specifies that language tags need to be compared using lower-case
Date Sun, 22 Jan 2017 14:38:27 GMT


ASF GitHub Bot commented on COMMONSRDF-51:

Github user stain commented on the issue:
    OK, so then it makes sense for the Commons RDF tests to only care about the
    value being preserved (whether the case going in or out is upper or
    lower), and that our .equals and .hashCode are based on lowercasing in the
    ROOT Locale.
    We don't have equivalent tests for whether datatyped floats etc. preserve
    their specific syntactic form (e.g. "-.0"^^xsd:float), so we should not do
    that for langtags either.
    I'll modify the branch and merge.
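
As a rough illustration of the behaviour described above, here is a minimal sketch, assuming a stand-alone class named LangLiteral. That class is hypothetical and is not the Commons RDF Literal implementation. It keeps the language tag exactly as supplied, while equals and hashCode compare the tag lowercased in Locale.ROOT.

```java
import java.util.Locale;
import java.util.Objects;

// Hypothetical illustration class; NOT the Commons RDF Literal implementation.
final class LangLiteral {
    private final String lexicalForm;
    private final String languageTag; // stored exactly as supplied

    LangLiteral(String lexicalForm, String languageTag) {
        this.lexicalForm = Objects.requireNonNull(lexicalForm);
        this.languageTag = Objects.requireNonNull(languageTag);
    }

    String getLexicalForm() {
        return lexicalForm;
    }

    // Returns the tag in whatever case the caller provided.
    String getLanguageTag() {
        return languageTag;
    }

    // Locale-independent lowercase form, used only for comparison and hashing.
    private String comparableTag() {
        return languageTag.toLowerCase(Locale.ROOT);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof LangLiteral)) {
            return false;
        }
        LangLiteral other = (LangLiteral) o;
        return lexicalForm.equals(other.lexicalForm)
                && comparableTag().equals(other.comparableTag());
    }

    @Override
    public int hashCode() {
        // Must hash the same lowercase form that equals compares.
        return Objects.hash(lexicalForm, comparableTag());
    }
}
```

With this sketch, new LangLiteral("Hello", "en-GB") and new LangLiteral("Hello", "EN-gb") are equal and share a hash code, while getLanguageTag() still returns the original "en-GB" for the first one.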
    On 21 Jan 2017 9:39 pm, "Andy Seaborne" <> wrote:
    > RDF 1.1 mentions:
    >    1. Turtle parsing - there is a lang tag rule.
    >    2. The text stating that conversion to a lowercase lexical form is allowed.
    >    3. Value-comparison is case insensitive.
    > Which is that test for? Lexical or value?
    > At least acknowledge that RDF's "lowercase" is not in keeping with BCP
    > 47 syntax canonicalization (the registry may change the characters).
    > Following whatever the spec that "owns" language tags says makes sense
    > to me and, I suspect, to domain experts. Focus on the value comparison.

> RDF-1.1 specifies that language tags need to be compared using lower-case
> -------------------------------------------------------------------------
>                 Key: COMMONSRDF-51
>                 URL:
>             Project: Apache Commons RDF
>          Issue Type: Bug
>          Components: api
>    Affects Versions: 0.3.0
>            Reporter: Peter Ansell
>            Assignee: Stian Soiland-Reyes
> The RDF-1.1 specification states that the value space of Literal language tags is lowercase,
which does not conflict with the case-insensitive specification in BCP47. The Literal.equals
and Literal.hashCode API contracts should specify that language tags must be compared using
lowercase, even if they are otherwise stored and returned as upper-case by getLanguageTag.
The API currently has incorrect language by saying "character-by-character" for language tag
comparisons, as that implies case-sensitive comparisons are used.
> The lowercasing must also be done using a consistent locale (a known example where
lowercase and uppercase do not roundtrip as expected for US-ASCII characters is Turkish [1]),
so I would recommend explicitly stating that .toLowerCase(Locale.ENGLISH) is used.
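
To make the locale concern concrete, here is a small sketch, assuming a helper class named LangTagCase that is purely illustrative and not part of the Commons RDF API. It lowercases a tag with an explicitly chosen locale, so the Turkish case can be compared with Locale.ENGLISH and Locale.ROOT.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of the Commons RDF API.
final class LangTagCase {
    private LangTagCase() {
    }

    // Lowercase a language tag using the given locale rather than the JVM default.
    static String lower(String tag, Locale locale) {
        return tag.toLowerCase(locale);
    }
}
```

Under these assumptions, LangTagCase.lower("EN-IN", Locale.ENGLISH) and LangTagCase.lower("EN-IN", Locale.ROOT) both return "en-in", while LangTagCase.lower("EN-IN", Locale.forLanguageTag("tr")) returns "en-\u0131n", because Turkish case rules map the ASCII "I" to the dotless "ı" (U+0131). A no-argument toLowerCase() uses the JVM default locale, so on a Turkish-locale system it would show the same behaviour.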
