cassandra-commits mailing list archives

From "Stu Hood (JIRA)" <j...@apache.org>
Subject [jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches
Date Wed, 08 Sep 2010 00:17:32 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-1434:
--------------------------------

    Attachment: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch
                0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch

0001 contains the changes to the RingCache that survived from v1: it fixes the RingCache bug
that was handled by pre-0004, and removes the multimap.

0002 is a completely revamped ColumnFamilyRecordWriter: nothing from the original patch survived.
* Launches a client thread per unique range, which is responsible for communicating with endpoint
replicas for that range.
** Each client thread receives mutations for its range from the parent thread on a bounded
queue.
** Each client thread attempts to send a full batch of mutations to its replicas in order:
this means that each batch gets up to RF attempts before failing, but when there are no failures,
connections will always be made to the first replica.
* The parent thread loops trying to offer to queues for client threads, and checks that they
are still alive (and fails if they aren't).
* For an N-node cluster, up to (2 * N * batchSize) mutations will be in memory at once, so
the default batchSize was lowered to 4096.
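
For readers without the patch at hand, the design described above can be sketched roughly as follows. This is an illustrative sketch only, not the code in 0002: RangeClient, ReplicaSender, put() and close() are hypothetical names, mutations are represented as plain strings, and the network call is abstracted behind an interface. Each client holds a bounded queue of batchSize plus one in-flight batch of up to batchSize, which is one plausible reading of the (2 * N * batchSize) bound.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a connection to one replica (not a Cassandra API). */
interface ReplicaSender {
    void send(String endpoint, List<String> batch) throws Exception;
}

/** Hypothetical client thread for one range: drains a bounded queue, tries replicas in order. */
class RangeClient extends Thread {
    private final List<String> replicas;        // endpoints for this range, first replica first
    private final BlockingQueue<String> queue;  // bounded: holds at most batchSize mutations
    private final int batchSize;
    private final ReplicaSender sender;
    private volatile boolean run = true;
    private volatile Exception lastException;

    RangeClient(List<String> replicas, int batchSize, ReplicaSender sender) {
        this.replicas = replicas;
        this.batchSize = batchSize;
        this.queue = new ArrayBlockingQueue<>(batchSize);
        this.sender = sender;
    }

    /** Parent thread side: loop on offer(), and fail if the client thread has died. */
    void put(String mutation) throws Exception {
        while (true) {
            if (lastException != null) throw lastException;
            if (!isAlive()) throw new IllegalStateException("client thread is not running");
            if (queue.offer(mutation, 100, TimeUnit.MILLISECONDS)) return;
        }
    }

    /** Stop accepting work, let the queue drain, and surface any failure to the caller. */
    void close() throws Exception {
        run = false;
        join();
        if (lastException != null) throw lastException;
    }

    @Override
    public void run() {
        List<String> batch = new ArrayList<>(batchSize);
        try {
            while (run || !queue.isEmpty()) {
                String first = queue.poll(100, TimeUnit.MILLISECONDS);
                if (first == null) continue;
                batch.add(first);
                queue.drainTo(batch, batchSize - batch.size());
                sendWithRetries(batch);
                batch.clear();
            }
        } catch (Exception e) {
            lastException = e;  // the parent thread sees this in put() or close()
        }
    }

    /** Try each replica in order; the batch fails only if every replica fails. */
    private void sendWithRetries(List<String> batch) throws Exception {
        Exception last = null;
        for (String endpoint : replicas) {
            try {
                sender.send(endpoint, batch);
                return;
            } catch (Exception e) {
                last = e;  // fall through to the next replica
            }
        }
        throw new Exception("all " + replicas.size() + " replicas failed", last);
    }
}
```

In this sketch the parent would create one RangeClient per range, call start(), route each mutation to the matching client with put(), and call close() on every client when the task finishes. Because a failed batch is resent whole to the next replica, the write is assumed to be safe to repeat.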

Fairly well tested against a 12 node cluster: no obvious races or bottlenecks.

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

