Mailing-List: contact issues-help@commons.apache.org; run by ezmlm
Precedence: bulk
Reply-To: issues@commons.apache.org
Date: Mon, 17 Mar 2014 07:58:43 +0000 (UTC)
From: "Emmanuel Bourg (JIRA)" <jira@apache.org>
To: issues@commons.apache.org
Message-ID: <JIRA.12701765.1394997217651.88171.1395043123823@arcas>
In-Reply-To: <JIRA.12701765.1394997217651@arcas>
References: <JIRA.12701765.1394997217651@arcas>
Subject: [jira] [Comment Edited] (CSV-107) CSVFormat.EXCEL.parse should
 handle byte order marks
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/CSV-107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937528#comment-13937528 ] 

Emmanuel Bourg edited comment on CSV-107 at 3/17/14 7:57 AM:
-------------------------------------------------------------

A byte order mark is not a character, but a byte sequence at the beginning of the binary stream. {{CSVFormat.parse()}} works on a Reader, which is a character stream, so by that point it's too late to analyze the BOM.
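
To illustrate handling the BOM at the byte level, before any Reader exists, here is a minimal sketch using only the JDK. The class name BomAwareReaders and the method openUtf8 are hypothetical names made up for this example, not part of Commons CSV. It assumes the input is UTF-8 and only recognizes the UTF-8 BOM (EF BB BF); UTF-16 and UTF-32 BOMs are not handled.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of Commons CSV): consumes a leading UTF-8 BOM
// from the byte stream and returns a Reader positioned at the first real character.
public final class BomAwareReaders {

    private BomAwareReaders() {
    }

    // Assumption: the stream is UTF-8 encoded. Other encodings are not detected.
    public static Reader openUtf8(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = readUpTo(pushback, head);

        boolean hasBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;

        // No BOM: push the bytes back so the decoder sees them as normal data.
        if (!hasBom && n > 0) {
            pushback.unread(head, 0, n);
        }
        return new InputStreamReader(pushback, StandardCharsets.UTF_8);
    }

    // Reads until the buffer is full or the stream ends; returns the byte count read.
    private static int readUpTo(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int r = in.read(buf, total, buf.length - total);
            if (r < 0) {
                break;
            }
            total += r;
        }
        return total;
    }
}
```

The resulting Reader could then be passed to {{CSVFormat.EXCEL.parse(reader)}}. If Commons IO is available, its BOMInputStream class is probably a better fit than a hand-rolled helper like this one.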


was (Author: ebourg):
A byte order mark is not a character, but a byte sequence at the beginning of the binary stream. CSVFormat works on a Reader which is a character stream, it's too late to analyze the BOM at this point.

> CSVFormat.EXCEL.parse should handle byte order marks
> ----------------------------------------------------
>
>                 Key: CSV-107
>                 URL: https://issues.apache.org/jira/browse/CSV-107
>             Project: Commons CSV
>          Issue Type: Bug
>          Components: Parser
>    Affects Versions: 1.0
>            Reporter: Kenzley Alphonse
>            Priority: Critical
>         Attachments: vod.csv
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> CSVFormat.EXCEL.parse should take byte order marks into account when reading the input stream. Files with a byte order mark fail to parse properly.
> In my example, I have a byte order mark before my headers at the start of a CSV file. The parse fails when trying to get the header via the CSVRecord.get call.
> I marked this as critical because many users will interact with Windows users, who will most likely have files with a BOM.


--
This message was sent by Atlassian JIRA
(v6.2#6252)