Date: Tue, 22 May 2012 18:39:42 +0000 (UTC)
From: "Andy Seaborne (JIRA)"
To: any23-dev@incubator.apache.org
Reply-To: any23-dev@incubator.apache.org
Message-ID: <49378876.9004.1337711982081.JavaMail.jiratomcat@issues-vm>
In-Reply-To: <266752649.6039.1337647540798.JavaMail.jiratomcat@issues-vm>
Subject: [jira] [Commented] (ANY23-99) NQuadsWriter should force ASCII in OutputStream constructor

[ https://issues.apache.org/jira/browse/ANY23-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281138#comment-13281138 ]

Andy Seaborne commented on ANY23-99:
------------------------------------

Is there a specific example of this happening?

The encoding rules for NQuads are to use \u escapes, so something has to encode to ASCII, and it is not enough to rely on the writer doing chars to bytes.

I think this is handled via the calls:

Literals: org.openrdf.rio.ntriples.NTriplesUtil.toNTriplesString
URIs: org.openrdf.rio.ntriples.NTriplesUtil.escapeString
Comments: handleComment does not encode - this is (arguably) not quite right.

Also:

The charset requirements may well change. The soon-to-be-published working draft of the formal spec for N-Triples defines it to be UTF-8 when used with application/n-triples. The old rules for text/plain still apply (US-ASCII). I would expect N-Quads to follow N-Triples. This is all in the future.

> NQuadsWriter should force ASCII in OutputStream constructor
> -----------------------------------------------------------
>
>                 Key: ANY23-99
>                 URL: https://issues.apache.org/jira/browse/ANY23-99
>             Project: Apache Any23
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8.0
>            Reporter: Peter Ansell
>
> The NQuads specification states that all NQuads documents must be ASCII encoded. [1] The current NQuadsWriter(OutputStream) constructor does not enforce this when creating the OutputStreamWriter that wraps the given output stream. If it is not enforced, then the user's locale will be used to create the OutputStreamWriter, which may not enforce US-ASCII.
> Patch is to replace the constructor with:
> this( new OutputStreamWriter(os, Charset.forName("US-ASCII")) );
> [1] http://sw.deri.org/2008/07/n-quads/#mediatype
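
As an illustration of the point made in the comment (escape to ASCII before the writer converts chars to bytes), combined with the constructor patch quoted in the issue, here is a minimal Java sketch. The class AsciiEscapeSketch and its methods escapeNonAscii and asciiWriter are hypothetical names, not Any23 or Sesame API. The escaping is deliberately simplified: it only rewrites non-ASCII code points, whereas real N-Triples escaping (for example in NTriplesUtil) also has to deal with backslashes, quotes, and control characters.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Any23 or Sesame.
class AsciiEscapeSketch {

    // Rewrites every code point above 0x7F as a backslash-u (4 hex digits)
    // or backslash-U (8 hex digits) escape. ASCII input passes through
    // unchanged; this simplification skips the other N-Triples escapes.
    static String escapeNonAscii(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            if (cp <= 0x7F) {
                sb.append((char) cp);
            } else if (cp <= 0xFFFF) {
                sb.append(String.format("\\u%04X", cp));
            } else {
                sb.append(String.format("\\U%08X", cp));
            }
        }
        return sb.toString();
    }

    // Same idea as the patch quoted in the issue: pick the charset
    // explicitly instead of inheriting the platform default.
    static Writer asciiWriter(OutputStream os) {
        return new OutputStreamWriter(os, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = asciiWriter(out)) {
            // Escape first, then let the writer turn chars into bytes.
            w.write("\"" + escapeNonAscii("caf\u00E9") + "\"");
        }
        System.out.println(out.toString("US-ASCII"));
    }
}
```

Without the escaping step, an OutputStreamWriter set to US-ASCII would normally substitute a replacement byte (usually '?') for any character it cannot encode rather than fail, so fixing the charset in the constructor would not by itself preserve non-ASCII data.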