cassandra-commits mailing list archives

From Piotr Kołaczkowski (JIRA) <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6927) Create a CQL3 based bulk OutputFormat
Date Tue, 08 Jul 2014 10:02:04 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14054786#comment-14054786 ]

Piotr Kołaczkowski commented on CASSANDRA-6927:
-----------------------------------------------

CqlOutputFormat, L31, L33:
{noformat}OutputFormat that allows reduce tasks insert the binded variable values{noformat}
binded -> bound

org.apache.cassandra.hadoop.AbstractBulkRecordWriter.ExternalClient#init,
org.apache.cassandra.hadoop.AbstractBulkRecordWriter.ExternalClient#createThriftClient:
These methods duplicate quite a lot of code found in org.apache.cassandra.hadoop.ConfigHelper#getClientFromOutputAddressList.
Additionally, the ExternalClient code hardcodes a reference to FramedTransport and doesn't
use the configured ITransportFactory (an extension point used by e.g. DSE to plug in Kerberos
authentication). I know this code was there before (it was moved from BulkRecordWriter), but
maybe this is a good occasion to clean it up a little?

org.apache.cassandra.hadoop.cql3.CqlBulkRecordWriter#prepareWriter, L88:
Unchecked result of mkdirs.

org/apache/cassandra/hadoop/cql3/CqlBulkRecordWriter.java:89:
The getColumnFamilySchema result is not checked; it might return null and cause an NPE instead of
a descriptive message.

org/apache/cassandra/hadoop/cql3/CqlBulkRecordWriter.java:96
org.apache.cassandra.hadoop.cql3.CqlBulkRecordWriter#getColumnFamilyInsertStatement - as above,
needs a proper null check.

I think those two should be checked in checkOutputSpecs at the level of CqlBulkOutputFormat
(needs an override); it is better to fail early.
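
To illustrate the kind of checks I mean, here is a minimal sketch. The class and method names (OutputChecks, ensureDirectory, requireConfigured) are hypothetical and not part of the patch; the real checks would live in prepareWriter and checkOutputSpecs and use whatever exception types fit there.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, for illustration only; not code from the patch.
public class OutputChecks {

    /** Creates the directory (and any parents), or throws if it cannot be created. */
    public static File ensureDirectory(File dir) throws IOException {
        // mkdirs() returns false both on failure and when the directory already
        // exists, so re-check isDirectory() before treating it as an error.
        if (!dir.mkdirs() && !dir.isDirectory()) {
            throw new IOException("Could not create output directory: " + dir);
        }
        return dir;
    }

    /** Returns the value, or throws with a message naming the missing setting. */
    public static String requireConfigured(String value, String description) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException(
                "Missing required job configuration: " + description);
        }
        return value;
    }
}
```

Calling something like requireConfigured on the schema and the insert statement during job setup would turn a late NPE in a task into an early, descriptive failure.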

org.apache.cassandra.hadoop.cql3.CqlBulkRecordWriter#write doesn't invoke the Hadoop progress
method, as BulkRecordWriter does:
{noformat}
if (null != progress)
   progress.progress();
if (null != context)
   HadoopCompat.progress(context);
{noformat}
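
A rough sketch of where the call would sit in the write path. ProgressSink is a hypothetical stand-in for Hadoop's progress callback so that the example is self-contained; the actual writer would use the progress and context fields shown above.

```java
// Illustrative only: a writer that reports progress after every row.
public class ProgressReportingWriter {

    /** Hypothetical stand-in for Hadoop's progress callback; not the real interface. */
    public interface ProgressSink {
        void progress();
    }

    private final ProgressSink progress; // may be null, as in BulkRecordWriter
    private long rowsWritten = 0;

    public ProgressReportingWriter(ProgressSink progress) {
        this.progress = progress;
    }

    public void write(String row) {
        // ... the row would be handed to the SSTable writer here ...
        rowsWritten++;
        // Tell the framework the task is still alive after each row.
        if (progress != null) {
            progress.progress();
        }
    }

    public long rowsWritten() {
        return rowsWritten;
    }
}
```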

org/apache/cassandra/hadoop/cql3/CqlBulkRecordWriter.java:144
{noformat}
      throw new IOException("Error adding row", e);
{noformat}
Maybe including the key in the error message would be useful here?
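
Something along these lines, as a sketch only. It assumes the key is available as a ByteBuffer at that point, which may not match the actual write signature; RowErrors and its methods are hypothetical names.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only; not code from the patch.
public class RowErrors {

    /** Renders the remaining bytes of the key as hex without moving its position. */
    public static String toHex(ByteBuffer key) {
        StringBuilder sb = new StringBuilder();
        ByteBuffer view = key.duplicate(); // independent position, shared content
        while (view.hasRemaining()) {
            sb.append(String.format("%02x", view.get() & 0xff));
        }
        return sb.toString();
    }

    /** Wraps the cause in an IOException whose message names the failing key. */
    public static IOException rowError(ByteBuffer key, Exception cause) {
        return new IOException("Error adding row with key 0x" + toHex(key), cause);
    }
}
```

The catch block would then do `throw RowErrors.rowError(key, e);` instead of building the message inline.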






> Create a CQL3 based bulk OutputFormat
> -------------------------------------
>
>                 Key: CASSANDRA-6927
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6927
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Hadoop
>            Reporter: Paul Pak
>            Priority: Minor
>              Labels: cql3, hadoop
>         Attachments: 6927-2.0-branch-v2.txt, trunk-6927.txt
>
>
> This is the CQL compatible version of BulkOutputFormat.  CqlOutputFormat exists, but
> doesn't write SSTables directly, similar to ColumnFamilyOutputFormat for thrift.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
