Date: Thu, 29 Aug 2013 12:04:33 -0700
Subject: CQL3 wide row and slow inserts - is there a single insert alternative?
From: Les Hazlewood
To: user@cassandra.apache.org

Hi all,

We're using a Cassandra table (column family) to store search results. It looks like this:

+--------+---------+---------+---------+----
|        | 0       | 1       | 2       | ...
+--------+---------+---------+---------+----
| row_id | text... | text... | text... | ...

The column name is the index # (an integer) of the result's location in the overall result set. The value is the result at that particular index. This is great because pagination becomes a simple slice query on the column name.

Large result sets are split into multiple rows - we're limiting row size on disk to around 6 or 7 MB. For our particular result entries, this means we can get around 50,000 columns in a single row.

When we create the rows, we have all of the data available in the application at the time the row insert is necessary.

Using CQL3, an initial implementation had one INSERT statement per column. This was killing performance (not to mention the # of tombstones it created).

Here's the CQL3 table definition:

    create table query_results (
        row_id text,
        shard_num int,
        list_index int,
        result text,
        primary key ((row_id, shard_num), list_index)
    ) with compact storage;

(The row key is row_id + shard_num. The clustering column is list_index.)

I don't want to execute 50,000 INSERT statements for a single row. We have all of the data up front - I want to execute a single INSERT. Is this possible? We're using the Datastax Java Driver.

Thanks for any help!

Les
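
For readers following the layout described above, here is a minimal sketch of how a position in the overall result set might map onto (shard_num, list_index), and what the resulting page slice could look like. It is only an illustration under stated assumptions: the class name ResultShards, the fixed 50,000 columns per shard, and list_index restarting at 0 in each shard are all hypothetical and not confirmed by the post. It uses plain Java and builds the CQL3 text only; it does not call the driver.

```java
import java.util.Locale;

/** Hypothetical helper: maps an overall result index onto the query_results layout. */
public final class ResultShards {
    // Assumption: every shard holds a fixed number of columns (the post cites about 50,000).
    public static final int COLUMNS_PER_SHARD = 50_000;

    private ResultShards() {}

    /** Shard (row) that holds the result at the given overall index. */
    public static int shardNum(long overallIndex) {
        return (int) (overallIndex / COLUMNS_PER_SHARD);
    }

    /** Column name inside that shard (assumes list_index restarts at 0 per shard). */
    public static int listIndex(long overallIndex) {
        return (int) (overallIndex % COLUMNS_PER_SHARD);
    }

    /**
     * CQL3 slice for one page. The upper bound is clamped to the end of the
     * shard, so a page that crosses a shard boundary needs a second query.
     */
    public static String pageQuery(long firstIndex, int pageSize) {
        int shard = shardNum(firstIndex);
        int from = listIndex(firstIndex);
        int to = Math.min(from + pageSize, COLUMNS_PER_SHARD);
        return String.format(Locale.ROOT,
            "SELECT list_index, result FROM query_results "
            + "WHERE row_id = ? AND shard_num = %d "
            + "AND list_index >= %d AND list_index < %d",
            shard, from, to);
    }

    public static void main(String[] args) {
        // Page of 25 results starting at overall index 120,025.
        System.out.println(pageQuery(120_025, 25));
    }
}
```

With these assumptions, overall index 120,025 lands in shard 2 at list_index 20,025, so the page is a single slice on the clustering column within one row.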