cassandra-commits mailing list archives

From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8233) Additional file handling capabilities for COPY FROM
Date Wed, 03 Feb 2016 01:16:40 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15129529#comment-15129529 ]

Stefania commented on CASSANDRA-8233:
-------------------------------------

Almost everything should be covered:

bq. the ability to skip file errors, write out 'bad' rows to file, skip blank lines, and set a max error count before a load is terminated.
All added by CASSANDRA-9303, except for skipping blank lines (unless the CSV parser handles that for us).

bq. Allow for columns that have quotes, but strip off the quotes before load.
This should be handled by the CSV parser, but it needs testing.

bq. Set the end of record delimiter.
Missing

bq. Be able to ignore file header/other starting line(s).
Added by CASSANDRA-9303, but we may need to review how the number of rows to skip is defined when loading multiple files.

bq. Have date and time format handling abilities (with date/time delimiters)
Added by CASSANDRA-9303

bq. Handle carriage returns in data
This should be handled by the CSV parser, but it needs testing.

bq. Skip column in file
Added by CASSANDRA-9303
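
The three items above that depend on the CSV parser (quoted columns, carriage returns in data, blank lines) can be checked quickly in isolation. The sketch below assumes the parser in question is Python's standard csv module; the comment does not name it, and the sample data and the parse helper are made up for illustration. It is not the COPY FROM code path itself.

```python
import csv
import io

# Hypothetical sample: a quoted field, a blank line, and a CRLF inside a quoted field.
SAMPLE = 'id,"name"\r\n\r\n1,"line one\r\nline two"\r\n'


def parse(text):
    # newline='' stops the text layer from translating line endings,
    # so the csv module sees the raw \r\n sequences.
    return list(csv.reader(io.StringIO(text, newline='')))


rows = parse(SAMPLE)

# The csv module reports a blank line as an empty list; the caller has to drop it.
non_blank = [row for row in rows if row]

for row in non_blank:
    print(row)
```

Under that assumption, the quotes are stripped and the embedded CRLF is kept inside the field, but the blank line comes back as an empty row rather than being skipped, so it would still need explicit filtering. Separately, csv.reader ignores the dialect's lineterminator setting and only recognises \r or \n as end of line, which is at least consistent with the end-of-record delimiter being listed as missing.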


> Additional file handling capabilities for COPY FROM
> ---------------------------------------------------
>
>                 Key: CASSANDRA-8233
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8233
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Robin Schumacher
>            Assignee: Stefania
>            Priority: Minor
>             Fix For: 2.1.x
>
>
> To compete better with other RDBMS-styled loaders, COPY needs to include some additional file handling capabilities:
> - the ability to skip file errors, write out 'bad' rows to file, skip blank lines, and set a max error count before a load is terminated.
> - Allow for columns that have quotes, but strip off the quotes before load.
> - Set the end of record delimiter. 
> - Be able to ignore file header/other starting line(s). 
> - Have date and time format handling abilities (with date/time delimiters)  
> - Handle carriage returns in data
> - Skip column in file



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
