cassandra-commits mailing list archives

From "Paulo Motta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM
Date Wed, 16 Dec 2015 18:51:46 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060520#comment-15060520 ]

Paulo Motta commented on CASSANDRA-9303:
----------------------------------------

Looking good, thanks! Some follow-ups below:

bq. CONFIGSECTIONS: this is removed and instead we search the following static sections: \[copy\],
\[copy-ks-table\], \[copy-ks-table-from\] or \[copy-ks-table-to\], in this order.

Sounds good! I'd suggest the following sections instead: \[copy(:ks.table)\] (global and per-table
options for both copy-to and copy-from), \[copy-from(:ks.table)\] (global and per-table copy-from
options) and \[copy-to(:ks.table)\] (global and per-table copy-to options), where (:ks.table) is
optional. So you can have \[copy\], \[copy-to\], \[copy-from\], \[copy:ks.table\],
\[copy-to:ks.table\] and \[copy-from:ks.table\].
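
A minimal sketch of how such a lookup could work, assuming Python 3's {{configparser}} and assuming that more specific sections override more general ones (that precedence is my assumption, it still needs to be agreed on). The function names {{copy_config_sections}} and {{read_copy_options}} are hypothetical and not part of cqlsh:

```python
import configparser


def copy_config_sections(direction, ks, table):
    """Return candidate section names, most general first.

    The ordering (general -> specific) is an assumption for illustration.
    """
    assert direction in ("from", "to")
    return [
        "copy",
        "copy-%s" % direction,
        "copy:%s.%s" % (ks, table),
        "copy-%s:%s.%s" % (direction, ks, table),
    ]


def read_copy_options(config_text, direction, ks, table):
    """Merge options from all matching sections; later sections win."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    opts = {}
    for section in copy_config_sections(direction, ks, table):
        if parser.has_section(section):
            opts.update(parser.items(section))
    return opts
```

With this scheme a value set in \[copy-from:ks.table\] would shadow the same option in \[copy-from\] or \[copy\], while options only present in \[copy\] would still apply.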

bq. if no error file is specified I've introduced a default error file called import_ks_table.err

Nice! Maybe we could add a unique suffix to the file name, to avoid appending to an existing
file from a previous execution?
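
One possible way to do this, as a rough sketch only: append a timestamp to the default name and fall back to a counter if that file already exists. The helper name {{default_err_file}} and the exact name format are assumptions, not what the patch does:

```python
import os
import time


def default_err_file(ks, table, directory="."):
    """Build a default error file path that does not collide with an
    existing file (hypothetical naming scheme)."""
    stamp = time.strftime("%Y%m%d_%H%M%S")
    base = "import_%s_%s_%s" % (ks, table, stamp)
    path = os.path.join(directory, base + ".err")
    n = 0
    # Two runs within the same second get a numeric suffix.
    while os.path.exists(path):
        n += 1
        path = os.path.join(directory, "%s_%d.err" % (base, n))
    return path
```

Note that the existence check is not atomic, so two imports started at the same instant could still pick the same name; that is probably acceptable for a default.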

bq. Another thing that follows from the CASSANDRA-9302 review is that the INGESTRATE only
works if it is much bigger than the CHUNKSIZE. We could address it here if you think this
is important. 

We can address it here if it won't take too much time; otherwise we can handle it separately.
Could we improve it by making the batch size adaptive, e.g. {{min(batchsize, ingest_rate - current_record)}},
or will something more complicated be needed?
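
To make the idea concrete, here is a minimal sketch. It assumes {{current_record}} means the number of records already sent in the current one-second window, which is my reading and may not match how the rate meter is actually implemented; {{adaptive_batch_size}} is a hypothetical name:

```python
def adaptive_batch_size(batch_size, ingest_rate, sent_this_second):
    """Cap the next batch so the per-second ingest rate is not exceeded.

    Returns 0 when the budget for the current window is used up, in which
    case the caller would wait for the next window before sending.
    """
    remaining = max(ingest_rate - sent_this_second, 0)
    return min(batch_size, remaining)
```

For example, with a batch size of 5000 and an ingest rate of 1000 the first batch in a window would be capped at 1000 records instead of overshooting the rate.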

Some minor things I missed before:

* Move {{SKIPCOLS}} to {{COPY_COMMON_OPTIONS}} since it can be used in both copy-to and copy-from.
* Regarding the behavior of {{SKIPCOLS}} with COPY FROM, right now it only supports having
fewer columns in the CSV. Should we also support actually skipping columns in the CSV even
if they are present?
** Another related feature to have in the future would be to pick only specific columns from
the CSV and allow custom orderings of columns, but we can leave that for later if there's
a need.
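
As an illustration of the second {{SKIPCOLS}} point (skipping columns that are present in the CSV), something along these lines might be enough. This is only a sketch assuming Python 3; {{skip_columns}} and its parameters are hypothetical and not taken from the patch:

```python
import csv
import io


def skip_columns(csv_text, columns, skip_cols):
    """Yield CSV rows with the values of skipped columns removed.

    columns:   column names in the order they appear in the CSV
    skip_cols: set of column names to drop even though they are present
    """
    keep = [i for i, name in enumerate(columns) if name not in skip_cols]
    for row in csv.reader(io.StringIO(csv_text)):
        yield [row[i] for i in keep]
```

For a file with columns id, name, extra and {{SKIPCOLS}} set to extra, each row would be reduced to its id and name values before being imported.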

After those are addressed you can probably start making 2.2+ patches.

> Match cassandra-loader options in COPY FROM
> -------------------------------------------
>
>                 Key: CASSANDRA-9303
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Tools
>            Reporter: Jonathan Ellis
>            Assignee: Stefania
>            Priority: Critical
>             Fix For: 2.1.x
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to handle real
world requirements, we should match those.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
