cassandra-commits mailing list archives

From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9552) COPY FROM times out after 110000 inserts
Date Mon, 18 Jan 2016 03:58:39 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15104132#comment-15104132
] 

Stefania commented on CASSANDRA-9552:
-------------------------------------

It works, but it is extremely slow server-side, at least in 2.1. I therefore had to keep both
the batch size and the ingest rate very low; for example, I used {{WITH MAXBATCHSIZE = 2 AND
INGESTRATE = 1000}} with three 2.1 nodes. {{batch_size_warn_threshold_in_kb}} should also be
increased from 5 to 15 to avoid server warnings, since each partition is approximately 5-6 KB.
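
As a rough check of the numbers above, the sketch below estimates the batch size from the per-partition size and compares it with the warning threshold. It assumes a batch is roughly MAXBATCHSIZE times the per-partition size and ignores protocol overhead; the function names are hypothetical and are not part of cqlsh or Cassandra.

```python
def estimated_batch_kb(partition_kb, max_batch_size):
    """Rough batch size in KB, assuming every entry in the batch is one
    partition of about partition_kb (an assumption, overhead ignored)."""
    return partition_kb * max_batch_size


def exceeds_warn_threshold(partition_kb, max_batch_size, warn_threshold_kb):
    """True if the estimated batch is larger than the value configured as
    batch_size_warn_threshold_in_kb."""
    return estimated_batch_kb(partition_kb, max_batch_size) > warn_threshold_kb


if __name__ == "__main__":
    # Partitions of about 6 KB with MAXBATCHSIZE = 2, as in the comment above.
    for threshold_kb in (5, 15):
        warns = exceeds_warn_threshold(6, 2, threshold_kb)
        print("threshold %d KB -> warning expected: %s" % (threshold_kb, warns))
```

With 5-6 KB partitions and two per batch, the estimate is 10-12 KB, which is above the default of 5 but below 15.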

There is a retry policy that backs off exponentially. It kicks in if the ingest rate is too
high and tries to resend up to MAXATTEMPTS times (5 by default). After that we raise a timeout
error. To view its messages it is necessary to run cqlsh with {{--debug}}; otherwise the
timeout errors are visible but the retries are not, and we only notice that progress is very
slow. In addition to the back-off policy, we also retry up to MAXATTEMPTS times for any
error, including timeout errors, so we try as hard as we can to complete despite timeouts.
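
To illustrate the general idea of retrying with exponential back-off, here is a minimal sketch. It is not the cqlsh implementation: {{ImportTimeout}}, {{send_with_backoff}}, the base delay and the doubling factor are all assumptions made for the example, and only the default of 5 attempts is taken from the comment above.

```python
import time


class ImportTimeout(Exception):
    """Hypothetical stand-in for a write timeout reported by the server."""


def send_with_backoff(send, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call send() until it succeeds or max_attempts is reached.

    The wait before retry n is base_delay * 2 ** (n - 1), so it doubles
    after every failed attempt. The last failure is re-raised to the caller.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except ImportTimeout:
            if attempt == max_attempts:
                raise  # out of attempts: surface the timeout
            sleep(base_delay * 2 ** (attempt - 1))


if __name__ == "__main__":
    outcomes = iter([ImportTimeout(), ImportTimeout(), "ok"])

    def flaky_send():
        # Fails twice, then succeeds (simulated, no server involved).
        outcome = next(outcomes)
        if isinstance(outcome, Exception):
            raise outcome
        return outcome

    print(send_with_backoff(flaky_send, base_delay=0.01))
```

The {{sleep}} parameter is injectable only so the delays can be observed or skipped when trying the sketch out.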

What I noticed with 3 nodes running locally is that at 1000 rows / second it works fine.
Between 1000 and 3000 the retry policy kicks in, but the import completes without raising
errors. Above 3000 we also start getting timeout errors but still complete, albeit much more
slowly than at 1000 rows / second. Much higher than 3000, however, it generates too many
timeouts and some rows are therefore not imported.

> COPY FROM times out after 110000 inserts
> ----------------------------------------
>
>                 Key: CASSANDRA-9552
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9552
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter:  Brian Hess
>              Labels: cqlsh
>             Fix For: 2.1.x
>
>
> I am trying to test out performance of COPY FROM on various schemas.  I have a 100-BIGINT-column table defined as:
> {code}
> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;
> CREATE TABLE test.test100 (
>     pkey bigint,    ccol bigint,    col0 bigint,    col1 bigint,    col10 bigint,
>     col11 bigint,    col12 bigint,    col13 bigint,    col14 bigint,    col15 bigint,
>     col16 bigint,    col17 bigint,    col18 bigint,    col19 bigint,    col2 bigint,
>     col20 bigint,    col21 bigint,    col22 bigint,    col23 bigint,    col24 bigint,
>     col25 bigint,    col26 bigint,    col27 bigint,    col28 bigint,    col29 bigint,
>     col3 bigint,    col30 bigint,    col31 bigint,    col32 bigint,    col33 bigint,
>     col34 bigint,    col35 bigint,    col36 bigint,    col37 bigint,    col38 bigint,
>     col39 bigint,    col4 bigint,    col40 bigint,    col41 bigint,    col42 bigint,
>     col43 bigint,    col44 bigint,    col45 bigint,    col46 bigint,    col47 bigint,
>     col48 bigint,    col49 bigint,    col5 bigint,    col50 bigint,    col51 bigint,
>     col52 bigint,    col53 bigint,    col54 bigint,    col55 bigint,    col56 bigint,
>     col57 bigint,    col58 bigint,    col59 bigint,    col6 bigint,    col60 bigint,
>     col61 bigint,    col62 bigint,    col63 bigint,    col64 bigint,    col65 bigint,
>     col66 bigint,    col67 bigint,    col68 bigint,    col69 bigint,    col7 bigint,
>     col70 bigint,    col71 bigint,    col72 bigint,    col73 bigint,    col74 bigint,
>     col75 bigint,    col76 bigint,    col77 bigint,    col78 bigint,    col79 bigint,
>     col8 bigint,    col80 bigint,    col81 bigint,    col82 bigint,    col83 bigint,
>     col84 bigint,    col85 bigint,    col86 bigint,    col87 bigint,    col88 bigint,
>     col89 bigint,    col9 bigint,    col90 bigint,    col91 bigint,    col92 bigint,
>     col93 bigint,    col94 bigint,    col95 bigint,    col96 bigint,    col97 bigint,
>     PRIMARY KEY (pkey, ccol)
> ) WITH CLUSTERING ORDER BY (ccol ASC)
>     AND bloom_filter_fp_chance = 0.01
>     AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
>     AND comment = ''
>     AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
>     AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99.0PERCENTILE';
> {code}
> I then try to load the linked file of 120,000 rows of 100 BIGINT columns via:
> {code}
> cqlsh -e "COPY test.test100(pkey,ccol,col0,col1,col2,col3,col4,col5,col6,col7,col8,col9,col10,col11,col12,col13,col14,col15,col16,col17,col18,col19,col20,col21,col22,col23,col24,col25,col26,col27,col28,col29,col30,col31,col32,col33,col34,col35,col36,col37,col38,col39,col40,col41,col42,col43,col44,col45,col46,col47,col48,col49,col50,col51,col52,col53,col54,col55,col56,col57,col58,col59,col60,col61,col62,col63,col64,col65,col66,col67,col68,col69,col70,col71,col72,col73,col74,col75,col76,col77,col78,col79,col80,col81,col82,col83,col84,col85,col86,col87,col88,col89,col90,col91,col92,col93,col94,col95,col96,col97)
> FROM 'data120K.csv'"
> {code}
> Data file here: https://drive.google.com/file/d/0B87-Pevy14fuUVcxemFRcFFtRjQ/view?usp=sharing
> After 110000 rows, it errors and hangs:
> {code}
> <stdin>:1:110000 rows; Write: 19848.21 rows/s
> Connection heartbeat failure
> <stdin>:1:Aborting import at record #1196. Previously inserted records are still present, and some records after that may be present as well.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
