cassandra-commits mailing list archives

From Piotr Kołaczkowski (JIRA) <>
Subject [jira] [Commented] (CASSANDRA-6927) Create a CQL3 based bulk OutputFormat
Date Tue, 08 Jul 2014 10:02:04 GMT


Piotr Kołaczkowski commented on CASSANDRA-6927:

CqlOutputFormat, L31, L33:
{noformat}OutputFormat that allows reduce tasks insert the binded variable values{noformat}
binded -> bound

They duplicate quite a lot of code found in org.apache.cassandra.hadoop.ConfigHelper#getClientFromOutputAddressList.
Additionally, the ExternalClient code hardcodes a reference to FramedTransport and doesn't
use the configured ITransportFactory (an extension point used by e.g. DSE to plug in Kerberos
authentication). I know this code was there before (it was moved from BulkRecordWriter), but
maybe this is a good occasion to clean it up a little?

org.apache.cassandra.hadoop.cql3.CqlBulkRecordWriter#prepareWriter, L88:
Unchecked result of mkdirs.
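
A minimal sketch of what checking the mkdirs result could look like, assuming the output directory is a plain java.io.File. The class and method names (OutputDirs, ensureDirectory) are hypothetical and not taken from the patch:

```java
import java.io.File;
import java.io.IOException;

class OutputDirs {
    // Hypothetical helper, not part of the patch: creates the directory or fails loudly.
    static File ensureDirectory(File dir) throws IOException {
        // mkdirs() returns false both on failure and when the directory already exists,
        // so re-check isDirectory() before reporting an error.
        if (!dir.mkdirs() && !dir.isDirectory()) {
            throw new IOException("Failed to create output directory: " + dir.getAbsolutePath());
        }
        return dir;
    }
}
```

With something like this, prepareWriter would fail with a clear message instead of hitting an obscure error later when the writer tries to use a directory that doesn't exist.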

getColumnFamilySchema result is not checked; it might return null and cause an NPE instead of a descriptive error.

org.apache.cassandra.hadoop.cql3.CqlBulkRecordWriter#getColumnFamilyInsertStatement - as above,
needs a proper null check.

I think those two should be checked in checkOutputSpecs at the level of CqlBulkOutputFormat
(needs an override); it is better to fail early.
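
A rough sketch of the fail-early idea, under stated assumptions: a plain Map stands in for the Hadoop job configuration, and the property names and the OutputSpecCheck class are hypothetical (the keys actually used by the patch may differ):

```java
import java.io.IOException;
import java.util.Map;

class OutputSpecCheck {
    // Hypothetical property names, used for illustration only.
    static final String SCHEMA_KEY = "example.columnfamily.schema";
    static final String INSERT_KEY = "example.columnfamily.insert";

    // Validates the job settings up front, before any task starts writing.
    static void checkOutputSpecs(Map<String, String> conf) throws IOException {
        requireSet(conf, SCHEMA_KEY);
        requireSet(conf, INSERT_KEY);
    }

    private static void requireSet(Map<String, String> conf, String key) throws IOException {
        String value = conf.get(key);
        if (value == null || value.trim().isEmpty()) {
            // Descriptive failure at submission time instead of an NPE inside a task.
            throw new IOException("Required output setting is missing: " + key);
        }
    }
}
```

The point is only that a missing schema or insert statement gets reported once, at job setup, with the name of the missing setting.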

org.apache.cassandra.hadoop.cql3.CqlBulkRecordWriter#write doesn't invoke the Hadoop progress
method, as BulkRecordWriter does:
if (null != progress)
if (null != context)
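
To illustrate the null-guarded progress call in isolation, here is a small sketch. ProgressReporter is a hypothetical stand-in for the Hadoop progress/context objects, and writeAll is not a method from the patch:

```java
import java.util.List;

class ProgressSketch {
    // Hypothetical stand-in for the Hadoop progress callback.
    interface ProgressReporter {
        void progress();
    }

    // Processes each row and reports progress after every one, if a reporter is available.
    static int writeAll(List<String> rows, ProgressReporter reporter) {
        int written = 0;
        for (String row : rows) {
            written++; // stand-in for the real per-row write
            if (reporter != null) {
                reporter.progress();
            }
        }
        return written;
    }
}
```

Reporting progress per row matters for long bulk writes, since a task that never reports may be considered hung by the framework.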

      throw new IOException("Error adding row", e);
Maybe logging the key in the error message would be useful here? 
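
Something along these lines, as a sketch only. RowWriteErrors and withKey are hypothetical names, and the key is simply formatted with string concatenation (a ByteBuffer key would need proper decoding to be readable):

```java
import java.io.IOException;

class RowWriteErrors {
    // Hypothetical helper: wraps the original failure and names the offending key.
    static IOException withKey(Object key, Exception cause) {
        return new IOException("Error adding row with key: " + key, cause);
    }
}
```

In the catch block this would read: throw RowWriteErrors.withKey(key, e);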

> Create a CQL3 based bulk OutputFormat
> -------------------------------------
>                 Key: CASSANDRA-6927
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Hadoop
>            Reporter: Paul Pak
>            Priority: Minor
>              Labels: cql3, hadoop
>         Attachments: 6927-2.0-branch-v2.txt, trunk-6927.txt
> This is the CQL compatible version of BulkOutputFormat.  CqlOutputFormat exists, but
doesn't write SSTables directly, similar to ColumnFamilyOutputFormat for thrift.
