commons-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COMMONSRDF-51) RDF-1.1 specifies that language tags need to be compared using lower-case
Date Sat, 21 Jan 2017 21:40:26 GMT

    [ https://issues.apache.org/jira/browse/COMMONSRDF-51?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15833155#comment-15833155 ]

ASF GitHub Bot commented on COMMONSRDF-51:
------------------------------------------

Github user afs commented on the issue:

    https://github.com/apache/commons-rdf/pull/30
  
    RDF 1.1 mentions:
    
    1. Turtle parsing - there is a lang tag rule.
    2. The text stating that conversion to a lowercase lexical form is allowed.
    3. Value-comparison is case insensitive.
    
    Which is that test for? Lexical or value?
    
    Whatever the spec says, at least acknowledging that RDF's "lowercase" is not in keeping
    with BCP 47 syntax canonicalization (the registry may change the characters) makes sense
    to me and, I suspect, to domain experts; that follows the spec that "owns" language tags.
    Focus on the value comparison.
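
    To make the lexical/value distinction concrete, here is a minimal sketch. It is not the
    Commons RDF API: the class and method names are hypothetical, and the use of Locale.ROOT
    for case folding is an assumption made for illustration.

```java
import java.util.Locale;

// Hypothetical illustration only; not part of Commons RDF.
final class LangTagComparison {

    // Lexical comparison: the two tags must match character by character.
    static boolean sameLexical(String a, String b) {
        return a.equals(b);
    }

    // Value comparison: case is ignored by lowercasing both tags
    // with a fixed locale (Locale.ROOT is an assumption here).
    static boolean sameValue(String a, String b) {
        return a.toLowerCase(Locale.ROOT).equals(b.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(sameLexical("en-GB", "en-gb")); // false
        System.out.println(sameValue("en-GB", "en-gb"));   // true
    }
}
```

    A test written against sameLexical pins down how the tag is stored; one written against
    sameValue only pins down equality, which is the case to focus on.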



> RDF-1.1 specifies that language tags need to be compared using lower-case
> -------------------------------------------------------------------------
>
>                 Key: COMMONSRDF-51
>                 URL: https://issues.apache.org/jira/browse/COMMONSRDF-51
>             Project: Apache Commons RDF
>          Issue Type: Bug
>          Components: api
>    Affects Versions: 0.3.0
>            Reporter: Peter Ansell
>            Assignee: Stian Soiland-Reyes
>
> The RDF-1.1 specification states that the [value space of Literal language tags is lowercase|https://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal],
> which does not conflict with the case-insensitive specification in BCP47. The Literal.equals
> and Literal.hashCode API contracts should specify that language tags must be compared in
> lowercase, even if they are otherwise stored and returned in upper case by getLanguageTag.
> The API documentation is currently incorrect in saying "character-by-character" for language
> tag comparisons, as that implies case-sensitive comparisons are used.
> The lowercasing must also be done using a consistent locale (a known example where
> lowercase and uppercase do not round-trip as expected for US-ASCII characters is Turkish [1]),
> so I would recommend explicitly stating that .toLowerCase(Locale.ENGLISH) is used.
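
The locale concern in the description can be reproduced with a short sketch. The class and the comparisonForm helper are hypothetical names used only for illustration; the only call taken from the issue text is .toLowerCase(Locale.ENGLISH).

```java
import java.util.Locale;

// Hypothetical illustration only; not part of Commons RDF.
final class LangTagLowercase {

    // Normalise a language tag for comparison, using a fixed locale
    // as the issue description recommends.
    static String comparisonForm(String languageTag) {
        return languageTag.toLowerCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // Under Turkish casing rules 'I' lowercases to dotless 'ı' (U+0131),
        // so the result is no longer the ASCII string "en-fi".
        System.out.println("EN-FI".toLowerCase(turkish).equals("en-fi")); // false

        // With a fixed English locale the ASCII letters map as expected.
        System.out.println(comparisonForm("EN-FI").equals("en-fi"));      // true
    }
}
```

The no-argument String.toLowerCase() uses the JVM's default locale, so an implementation that relies on it could compare tags differently depending on where it runs.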



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
