Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 30ECE200BFE for ; Mon, 16 Jan 2017 23:15:35 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id 2F55B160B41; Mon, 16 Jan 2017 22:15:35 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 784C8160B28 for ; Mon, 16 Jan 2017 23:15:34 +0100 (CET) Received: (qmail 40943 invoked by uid 500); 16 Jan 2017 22:15:33 -0000 Mailing-List: contact issues-help@commons.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: issues@commons.apache.org Delivered-To: mailing list issues@commons.apache.org Received: (qmail 40932 invoked by uid 99); 16 Jan 2017 22:15:33 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 16 Jan 2017 22:15:33 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id F3EF6180BB7 for ; Mon, 16 Jan 2017 22:15:32 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -1.198 X-Spam-Level: X-Spam-Status: No, score=-1.198 tagged_above=-999 required=6.31 tests=[KAM_ASCII_DIVIDERS=0.8, KAM_LAZY_DOMAIN_SECURITY=1, RP_MATCHES_RCVD=-2.999, URIBL_BLOCKED=0.001] autolearn=disabled Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id Te5XBVSuaJyg for ; Mon, 16 Jan 2017 22:15:32 +0000 (UTC) Received: from mailrelay1-us-west.apache.org (mailrelay1-us-west.apache.org [209.188.14.139]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTP id D006A5FB33 for ; Mon, 16 Jan 2017 22:15:31 +0000 (UTC) Received: from jira-lw-us.apache.org (unknown [207.244.88.139]) by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id 2BB0AE865F for ; Mon, 16 Jan 2017 22:15:28 +0000 (UTC) Received: from jira-lw-us.apache.org (localhost [127.0.0.1]) by jira-lw-us.apache.org (ASF Mail Server at jira-lw-us.apache.org) with ESMTP id 8A8A325288 for ; Mon, 16 Jan 2017 22:15:26 +0000 (UTC) Date: Mon, 16 Jan 2017 22:15:26 +0000 (UTC) From: "ASF GitHub Bot (JIRA)" To: issues@commons.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (COMMONSRDF-51) RDF-1.1 specifies that language tags need to be compared using lower-case MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 archived-at: Mon, 16 Jan 2017 22:15:35 -0000 [ https://issues.apache.org/jira/browse/COMMONSRDF-51?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824660#comment-15824660 ] ASF GitHub Bot commented on COMMONSRDF-51: ------------------------------------------ Github user afs commented on the issue: https://github.com/apache/commons-rdf/pull/30 @ansell mentions one of the reasons the wording for RDF 1.1is not so direct - RDF 1.0 did not sanction the common normalization defined in BCP47 canonicalization, although that actually requires consulting the registry as well. Jena is lax by default, and retains the form as originally written. In practice, datasets seem to be internally consistent, all lower case or all syntax-canonical. Variations of case are different nodes in the general case but are `Node.sameValue` (compare) and cause matching in graph.find. Some storage layers may differ and canonicalize the form, in order to index. > RDF-1.1 specifies that language tags need to be compared using lower-case > ------------------------------------------------------------------------- > > Key: COMMONSRDF-51 > URL: https://issues.apache.org/jira/browse/COMMONSRDF-51 > Project: Apache Commons RDF > Issue Type: Bug > Components: api > Affects Versions: 0.3.0 > Reporter: Peter Ansell > Assignee: Stian Soiland-Reyes > > The [RDF-1.1 specification states that the [value space of Literal language tags is lowercase|https://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal], which does not conflict with the case-insensitive specification in BCP47. The Literal.equals and Literal.hashCode API contracts should specify that language tags must be compared using lowercase, even if they are otherwise stored and returned as upper-case by getLanguageTag. The API currently has incorrect language by saying "character-by-character" for language tag comparisons, as that implies case-sensitive comparisons are used. > The lowercasing must also be done using a locale that is consistent (known example where lowercase and uppercase do not roundtrip as expected for US-ASCII characters is Turkish [1]), so I would recommend actually stating that .toLowerCase(Locale.ENGLISH) is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)