Return-Path:
X-Original-To: apmail-commons-issues-archive@minotaur.apache.org
Delivered-To: apmail-commons-issues-archive@minotaur.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 7922D10957 for ; Mon, 17 Mar 2014 07:58:55 +0000 (UTC)
Received: (qmail 97234 invoked by uid 500); 17 Mar 2014 07:58:51 -0000
Delivered-To: apmail-commons-issues-archive@commons.apache.org
Received: (qmail 95993 invoked by uid 500); 17 Mar 2014 07:58:45 -0000
Mailing-List: contact issues-help@commons.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: issues@commons.apache.org
Delivered-To: mailing list issues@commons.apache.org
Received: (qmail 95964 invoked by uid 99); 17 Mar 2014 07:58:44 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 17 Mar 2014 07:58:44 +0000
Date: Mon, 17 Mar 2014 07:58:43 +0000 (UTC)
From: "Emmanuel Bourg (JIRA)"
To: issues@commons.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Comment Edited] (CSV-107) CSVFormat.EXCEL.parse should handle byte order marks
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/CSV-107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937528#comment-13937528 ]

Emmanuel Bourg edited comment on CSV-107 at 3/17/14 7:57 AM:
-------------------------------------------------------------

A byte order mark is not a character, but a byte sequence at the beginning of the binary stream. {{CSVFormat.parse()}} works on a Reader, which is a character stream; it's too late to analyze the BOM at this point.

was (Author: ebourg):
    A byte order mark is not a character, but a byte sequence at the beginning of the binary stream. CSVFormat works on a Reader, which is a character stream; it's too late to analyze the BOM at this point.

> CSVFormat.EXCEL.parse should handle byte order marks
> ----------------------------------------------------
>
>                 Key: CSV-107
>                 URL: https://issues.apache.org/jira/browse/CSV-107
>             Project: Commons CSV
>          Issue Type: Bug
>          Components: Parser
>    Affects Versions: 1.0
>            Reporter: Kenzley Alphonse
>            Priority: Critical
>         Attachments: vod.csv
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> CSVFormat.EXCEL.parse should account for byte order marks when reading the input stream. Files with a byte order mark fail to parse properly.
> In my example, I have a byte order mark before my headers at the start of a CSV file. The parse fails when trying to get the header via the CSVRecord.get call.
> I marked this as critical because many users will interact with Windows users, who will most likely have files with a BOM.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
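
The point made in the comment above, that a BOM has to be dealt with while the input is still a byte stream, can be illustrated with a minimal sketch. It uses only the JDK and assumes the file is UTF-8, so it only recognises the bytes EF BB BF; UTF-16 and UTF-32 marks are not handled. The class name BomAwareReader and its open() method are hypothetical helpers invented for this illustration and are not part of Commons CSV.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Commons CSV.
final class BomAwareReader {

    private BomAwareReader() {}

    // Returns a UTF-8 Reader over the stream, dropping a leading
    // EF BB BF byte sequence if one is present. Assumes UTF-8 input.
    static Reader open(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];

        // Read up to three bytes; the stream may be shorter than that.
        int n = 0;
        while (n < head.length) {
            int r = pushback.read(head, n, head.length - n);
            if (r < 0) {
                break;
            }
            n += r;
        }

        boolean hasBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;

        // No BOM: put the bytes back so the decoder sees them.
        if (!hasBom && n > 0) {
            pushback.unread(head, 0, n);
        }
        return new InputStreamReader(pushback, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF,
                       'i', 'd', ',', 'n', 'a', 'm', 'e'};
        try (Reader reader = open(new ByteArrayInputStream(data))) {
            StringBuilder sb = new StringBuilder();
            for (int c; (c = reader.read()) != -1; ) {
                sb.append((char) c);
            }
            System.out.println(sb); // prints "id,name"
        }
    }
}
```

A Reader obtained this way could then be handed to CSVFormat.EXCEL.parse. Without such a step, Java's UTF-8 decoder passes the mark through as U+FEFF, so it ends up at the front of the first header name, which would be consistent with the failing CSVRecord.get lookup described in the report.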