cassandra-commits mailing list archives

From "Brandon Williams (Commented) (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-3045) Update ColumnFamilyOutputFormat to use new bulkload API
Date Mon, 28 Nov 2011 18:53:40 GMT


Brandon Williams commented on CASSANDRA-3045:

bq. How do you configure BOF vs CFOF?

By calling setOutputFormatClass on the job.
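
A minimal sketch of that switch, assuming BOF refers to the BulkOutputFormat class and CFOF to ColumnFamilyOutputFormat. The small classes below are stand-ins so the example compiles without Hadoop or Cassandra on the classpath; in a real job they would be Hadoop's Job and OutputFormat and the Cassandra output format classes. OutputFormatChoice and its configure method are hypothetical names used only for illustration.

```java
// Stand-ins (assumptions) so the sketch compiles on its own. In a real job these
// would be org.apache.hadoop.mapreduce.OutputFormat,
// org.apache.cassandra.hadoop.ColumnFamilyOutputFormat,
// org.apache.cassandra.hadoop.BulkOutputFormat and org.apache.hadoop.mapreduce.Job.
abstract class OutputFormat {}
class ColumnFamilyOutputFormat extends OutputFormat {}
class BulkOutputFormat extends OutputFormat {}

class Job {
    private Class<? extends OutputFormat> outputFormatClass;

    // Mirrors the shape of Hadoop's Job.setOutputFormatClass(Class).
    void setOutputFormatClass(Class<? extends OutputFormat> cls) {
        this.outputFormatClass = cls;
    }

    Class<? extends OutputFormat> getOutputFormatClass() {
        return outputFormatClass;
    }
}

class OutputFormatChoice {
    // Hypothetical helper: the only difference between the two modes
    // is which class is handed to setOutputFormatClass.
    static Job configure(boolean bulkLoad) {
        Job job = new Job();
        if (bulkLoad) {
            job.setOutputFormatClass(BulkOutputFormat.class);         // BOF: stream sstables
        } else {
            job.setOutputFormatClass(ColumnFamilyOutputFormat.class); // CFOF: write each record
        }
        return job;
    }

    public static void main(String[] args) {
        System.out.println(configure(true).getOutputFormatClass().getSimpleName());
        System.out.println(configure(false).getOutputFormatClass().getSimpleName());
    }
}
```

An existing job that already names ColumnFamilyOutputFormat would only need that one call changed to move to the bulk path.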

bq. Why do we need to keep CFOF around?

I can think of two reasons. First, removing it breaks every existing job, although that is
pretty easy for users to fix, as indicated above. Second, someone might want each individual
record written as soon as possible, rather than waiting for the entire job to finish and then
streaming a bunch of sstables. It's a latency vs. throughput tradeoff.

> Update ColumnFamilyOutputFormat to use new bulkload API
> -------------------------------------------------------
>                 Key: CASSANDRA-3045
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Hadoop
>            Reporter: Jonathan Ellis
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 1.1
>         Attachments: 0001-Remove-gossip-SS-requirement-from-BulkLoader.txt, 0002-Allow-DD-loading-without-yaml.txt,
> The bulk loading interface added in CASSANDRA-1278 is a great fit for Hadoop jobs.


