Subject: Re: row filter - binary comparator at certain range
From: Michael Segel
Date: Mon, 21 Oct 2013 06:38:50 -0500
Reply-To: user@hbase.apache.org
To: user@hbase.apache.org

Sorry if this double posts... I may have used the wrong email address first.

On Oct 21, 2013, at 6:36 AM, Michael Segel wrote:

> Let's look at what you are trying to do...
>
> You want to take data where the key is a timestamp (long datatype).
> You append it to a salt value, 1-10 or 0-9; your example doesn't say...
>
> You have a couple of problems with your choice of key...
>
> First, after your initial 10 splits, you will still end up writing everything to the left side of the region.
> This means that when the region splits, all writes will still go to the left, leaving your regions at half the size they could be, with the exception of your last set of salted regions: in this case 10, which will grow and then split.
>
> Is this a bad thing? Maybe yes, maybe no.
>
> The issue is that you will then have to write 10 queries, each with a start key and a stop key, to get the range of your timestamp.
>
> That would work, however...
>
> 1) Justify why you want/need to use a timestamp as a key for the row.
>
> I'd say tell us more about the use case and why you need this access pattern.
>
> Salting is bad in that the salt is dissociated from the underlying key.
> Taking the key's hash, truncating it, and prepending (if this is an actual word) it to the key gives you a randomly distributed key where, if you know the rowkey, you can hash it again to recover the prefix.
>
> My suggestion is that you rethink your key...
>
> On Oct 20, 2013, at 11:31 PM, Tony Duan wrote:
>
>> Alex Vasilenko writes:
>>
>>>
>>> Lars,
>>>
>>> But how will it behave when I have a salt at the beginning of the key to
>>> properly shard the table across regions? Imagine a row key of format
>>> salt:timestamp, with rows going like this:
>>> ...
>>> 1:15
>>> 1:16
>>> 1:17
>>> 1:23
>>> 2:3
>>> 2:5
>>> 2:12
>>> 2:15
>>> 2:19
>>> 2:25
>>> ...
>>>
>>> And I want to find all rows that have the second part (timestamp) in the range
>>> 15-25. What startKey and endKey should be used?
>>>
>>> Alexandr Vasilenko
>>> Web Developer
>>> Skype:menterr
>>> mob: +38097-611-45-99
>>>
>>> 2012/2/9 lars hofhansl
>> Hi,
>> Alexandr Vasilenko
>> Have you ever resolved this issue? I am also facing this issue.
>> I also want to implement this functionality.
>> Imagine a row key of format
>> salt:timestamp, with rows going like this:
>> ...
>> 1:15
>> 1:16
>> 1:17
>> 1:23
>> 2:3
>> 2:5
>> 2:12
>> 2:15
>> 2:19
>> 2:25
>> ...
>>
>> And I want to find all rows that have the second part (timestamp) in the range
>> 15-25.
>>
>> Could you please tell me how you resolved this?
>> Thanks in advance.
>>
>>
>> Tony Duan
>>
>>
>
> The opinions expressed here are mine; while they may reflect a cognitive thought, that is purely accidental.
> Use at your own risk.
> Michael Segel
> michael_segel (AT) hotmail.com
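
The "10 queries" approach from the reply above can be sketched roughly as follows. This is only an illustration in plain Python, not HBase client code. It assumes salt buckets 0-9, a 1-byte salt prefix, and an 8-byte big-endian non-negative timestamp, so that byte order matches numeric order (the thread's textual salt:timestamp form would not sort that way, since "2:12" sorts before "2:3"). The helper names make_key, salted_scan_ranges, and hashed_prefix_key are hypothetical, and SHA-256 is just one possible choice of hash. The stop key is treated as exclusive, as with an HBase scan's stop row.

```python
import hashlib
import struct

NUM_SALTS = 10  # assumption: salt buckets are 0-9


def make_key(salt: int, timestamp: int) -> bytes:
    """Build a row key: 1-byte salt followed by an 8-byte big-endian timestamp."""
    return struct.pack(">BQ", salt, timestamp)


def salted_scan_ranges(ts_from: int, ts_to: int, num_salts: int = NUM_SALTS):
    """Return one (start_key, stop_key) pair per salt bucket.

    ts_from and ts_to are inclusive; each stop_key is exclusive,
    so it is built from ts_to + 1.
    """
    return [
        (make_key(salt, ts_from), make_key(salt, ts_to + 1))
        for salt in range(num_salts)
    ]


def hashed_prefix_key(timestamp: int, num_buckets: int = NUM_SALTS) -> bytes:
    """Derive the prefix from a hash of the key itself.

    Unlike a random salt, this prefix can be recomputed by anyone
    who knows the original key.
    """
    raw = struct.pack(">Q", timestamp)
    bucket = hashlib.sha256(raw).digest()[0] % num_buckets
    return struct.pack(">B", bucket) + raw


if __name__ == "__main__":
    # One range per bucket for timestamps 15 through 25 inclusive.
    for start, stop in salted_scan_ranges(15, 25):
        print(start.hex(), stop.hex())
```

Each pair would drive one scan, and the client would merge the 10 result sets. With the hashed prefix, a single row can also be fetched directly, because the prefix is recomputed from the key rather than remembered.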