commons-issues mailing list archives

From "lbruun (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CSV-230) Support for csvw format
Date Thu, 16 Aug 2018 16:54:00 GMT

    [ https://issues.apache.org/jira/browse/CSV-230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16582822#comment-16582822
] 

lbruun edited comment on CSV-230 at 8/16/18 4:53 PM:
-----------------------------------------------------

{quote} [This|https://www.w3.org/TR/tabular-data-model/#formats-for-dates-and-times] part
is complete non-sense because it is impossible to make that kind of parsing unambiguous.
{quote}

I think you've misread the spec. The CSVW spec defines the XML Schema format as the default.
If a value is not in that format, then an explicit format string must be supplied in the metadata.
At least that is how I read it.
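
To illustrate that reading, here is a minimal Java sketch. It is not Commons CSV API and not a CSVW implementation: the class {{CsvwDateSketch}} and the method {{parseDate}} are hypothetical names, and it assumes the explicit format string from the metadata can be handed to {{java.time.format.DateTimeFormatter}} as a pattern, which is a simplification of the pattern rules in the spec.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class CsvwDateSketch {

    // Hypothetical helper: 'format' stands in for the explicit format string
    // that the metadata may carry for a date column (null when absent).
    static LocalDate parseDate(String value, String format) {
        if (format == null) {
            // No explicit format: assume the XML Schema (ISO 8601) date form.
            return LocalDate.parse(value, DateTimeFormatter.ISO_DATE);
        }
        // Explicit format: the pattern decides how the digits are read.
        return LocalDate.parse(value, DateTimeFormatter.ofPattern(format, Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(parseDate("2018-08-16", null));
        // The same text yields different dates under different explicit patterns.
        System.out.println(parseDate("01/02/2018", "dd/MM/yyyy"));
        System.out.println(parseDate("01/02/2018", "MM/dd/yyyy"));
    }
}
```

Under this reading, a value such as 01/02/2018 is never guessed at: it is either rejected by the default format or interpreted according to the pattern given in the metadata.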

But there are certainly many places where I would have liked the authors of the CSVW standard
to narrow down the choices even further. Why not, for example, allow dates and times only in
XML Schema format?

I've come to learn that there's a competing standard known as [Tabular Data Package|http://frictionlessdata.io/specs/tabular-data-package/],
which has been developed under the auspices of the [Frictionless Data initiative|http://frictionlessdata.io/].
Ideally Apache Commons CSV would support both this and the CSVW standard, but it is too early
to tell which will prevail.


> Support for csvw format
> -----------------------
>
>                 Key: CSV-230
>                 URL: https://issues.apache.org/jira/browse/CSV-230
>             Project: Commons CSV
>          Issue Type: Wish
>          Components: Parser
>            Reporter: lbruun
>            Priority: Major
>
> Since the dawn of time, use of the CSV format has been plagued by a lack of standardization.
Sure, we've had RFC4180, but it stops short of defining many things that would allow a consumer to
correctly interpret a CSV file.
> The [csvw|https://www.w3.org/TR/tabular-data-model/] is a fairly new standard from W3C
that aims to fix all of this. It defines the format of a CSV file (basically RFC4180) and in addition
defines how metadata for a CSV file should be conveyed.
> The csvw standard is a completed standard. It is not a work in progress.
> Don't be fooled by its name, "CSV on the Web": it applies equally well in the system-to-system
space.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
