From: Michael Segel
To: "user@hbase.apache.org"
CC: "dev@hbase.apache.org"
Subject: Lets talk about joins...
Date: Tue, 20 Aug 2013 13:56:40 -0500

Secondary indexes really become powerful when you want to join two tables.
(Something I thought was already being discussed....)

So you can use the inverted table as a secondary index, with one small glitch...

And then create a table of indexes, where each row represents an index and the columns are the rowkeys in that index.
(Call it a foreign key table.)

Now for the glitch... what happens when your row exceeds the width of your region? ;-)
There's a solution for that. ;-)

The other issue would be asynchronous writes.

I figured that one should get the talk started now, rather than wait until later.

This is why you want secondary indexes. The other issue is theta joins, but let's save that for later.

The opinions expressed here are mine; while they may reflect a cognitive thought, that is purely accidental.
Use at your own risk.
Michael Segel
michael_segel (AT) hotmail.com
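
To make the inverted-table idea concrete, here is a minimal in-memory sketch in Python. It is not HBase client code: plain dicts stand in for tables, and the table names, the `city` column, and both helper functions are hypothetical, made up for illustration. It assumes one reading of the post, namely that each index row is keyed by an indexed value and holds the base-table rowkeys carrying that value, so an equi-join becomes a match on index row keys.

```python
from collections import defaultdict


def build_inverted_index(table, column):
    """Map each value of `column` to the sorted base-table rowkeys holding it.

    Models an inverted table: index row key = column value,
    index columns = rowkeys of the base table (hypothetical layout).
    """
    index = defaultdict(set)
    for rowkey, row in table.items():
        if column in row:
            index[row[column]].add(rowkey)
    return {value: sorted(keys) for value, keys in index.items()}


def index_join(left_index, right_index):
    """Equi-join two tables by intersecting the row keys of their indexes."""
    pairs = []
    for value in sorted(left_index.keys() & right_index.keys()):
        for left_key in left_index[value]:
            for right_key in right_index[value]:
                pairs.append((value, left_key, right_key))
    return pairs


# Hypothetical base tables: rowkey -> {column: value}
customers = {"c1": {"city": "Chicago"}, "c2": {"city": "Austin"}}
stores = {
    "s1": {"city": "Chicago"},
    "s2": {"city": "Chicago"},
    "s3": {"city": "Denver"},
}

customer_index = build_inverted_index(customers, "city")
store_index = build_inverted_index(stores, "city")
matches = index_join(customer_index, store_index)
print(matches)
```

Note that a very common value (here, "Chicago" in `stores`) makes its index row grow with every matching base row. That is the wide-row glitch mentioned above, and this sketch does nothing to address it.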