From: "Paul Pak (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Date: Wed, 9 Jul 2014 18:23:05 +0000 (UTC)
Subject: [jira] [Comment Edited] (CASSANDRA-6927) Create a CQL3 based bulk OutputFormat

[ https://issues.apache.org/jira/browse/CASSANDRA-6927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056557#comment-14056557 ]

Paul Pak edited comment on CASSANDRA-6927 at 7/9/14 6:22 PM:
-------------------------------------------------------------

[~alexliu68] Hi Alex, thanks for your input. The fact that Hadoop properties aren't naturally specific to a column family is precisely the reason for not having generic schema/insertStatement properties and expecting them to apply to a particular column family, even if you happen to be working with only one column family.
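As a rough illustration of the point, here is a minimal sketch of column-family-qualified keys. The key prefixes, class name, and helper methods are hypothetical and are not taken from the attached patches; java.util.Properties stands in for the Hadoop job configuration so the example runs on its own.

```java
import java.util.Properties;

// Sketch only: every name here is an assumption made for illustration.
class ColumnFamilyKeys {
    // Hypothetical key prefixes; the column family name is appended to each.
    static final String SCHEMA_PREFIX = "example.bulk.output.schema.";
    static final String INSERT_PREFIX = "example.bulk.output.insert.";

    static void setColumnFamilySchema(Properties conf, String columnFamily, String schema) {
        conf.setProperty(SCHEMA_PREFIX + columnFamily, schema);
    }

    static void setColumnFamilyInsertStatement(Properties conf, String columnFamily, String insert) {
        conf.setProperty(INSERT_PREFIX + columnFamily, insert);
    }

    static String getColumnFamilySchema(Properties conf, String columnFamily) {
        return require(conf, SCHEMA_PREFIX + columnFamily);
    }

    static String getColumnFamilyInsertStatement(Properties conf, String columnFamily) {
        return require(conf, INSERT_PREFIX + columnFamily);
    }

    // The key already names the column family, so the only check needed
    // is that a value was actually configured under it.
    private static String require(Properties conf, String key) {
        String value = conf.getProperty(key);
        if (value == null) {
            throw new IllegalArgumentException("Missing property: " + key);
        }
        return value;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        setColumnFamilySchema(conf, "users", "CREATE TABLE ks.users (id int PRIMARY KEY, name text)");
        setColumnFamilyInsertStatement(conf, "users", "INSERT INTO ks.users (id, name) VALUES (?, ?)");
        setColumnFamilySchema(conf, "events", "CREATE TABLE ks.events (id int PRIMARY KEY, payload text)");

        // Each lookup is unambiguous, whether one or several column families are configured.
        System.out.println(getColumnFamilySchema(conf, "users"));
        System.out.println(getColumnFamilySchema(conf, "events"));
    }
}
```

With a single generic key (for example one unqualified schema property), the second setColumnFamilySchema call above would have overwritten the first, which is the ambiguity the qualified keys avoid.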
If some property value only applies to a specific column family, why not indicate it as such in the property key? It's certainly clearer and safer. Also, what would be the benefit of having overloaded set/getColumnFamily* methods? They require additional validations to ensure the proper ones were used for the appropriate scenario, as opposed to having unambiguous ones that don't require any validation and work in all cases. The only possible benefit I can see is if there was a case where a column family was either unknown or not applicable, but that will never be the case with these schema/insertStatement properties. In general, though, I prefer an approach where one solution works in all scenarios over one that entails variations of settings/methods that apply differently in different scenarios. It adds unnecessary complexity without any benefits and is prone to user confusion, misuse, and error.


was (Author: sixpak32577):
[~alexliu68] Hi Alex, thanks for your input. The fact that Hadoop properties aren't naturally specific to a column family is precisely the reason for not having generic schema/insertStatement properties and expecting them to apply to a particular column family, even if you happen to be working with only one column family.

If some property value only applies to a specific column family, why not indicate it as such in the property key? It's certainly clearer and safer. Also, what would be the benefit of having overloaded set/getColumnFamily* methods? They require additional validations to ensure the proper ones were used for the appropriate scenario, as opposed to having unambiguous ones that don't require any validation and work in all cases. The only possible benefit I can see is if there was a case where a column family was either unknown or not applicable, but that will never be the case with these schema/insertStatements properties.
In general, I prefer an approach where one solution works in all scenarios over one that entails variations of settings/methods that apply differently in different scenarios. It's adds unnecessary complexity without any benefits and is prone to user confusion, misuse, and error.

> Create a CQL3 based bulk OutputFormat
> -------------------------------------
>
> Key: CASSANDRA-6927
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6927
> Project: Cassandra
> Issue Type: New Feature
> Components: Hadoop
> Reporter: Paul Pak
> Priority: Minor
> Labels: cql3, hadoop
> Attachments: 6927-2.0-branch-v2.txt, trunk-6927-v3.txt, trunk-6927.txt
>
>
> This is the CQL compatible version of BulkOutputFormat. CqlOutputFormat exists, but doesn't write SSTables directly, similar to ColumnFamilyOutputFormat for thrift.

--
This message was sent by Atlassian JIRA
(v6.2#6252)