Message-ID: <4E84C9C9.30603@bronto.com>
Date: Thu, 29 Sep 2011 15:40:57 -0400
From: Jameson Lopp
Reply-To: jameson@bronto.com
Organization: Bronto Software
To: Doug Meil
CC: "user@hbase.apache.org"
Subject: Re: setTimeRange for HBase Increment

Thanks! Nevertheless, can anyone confirm / deny if the scenario I described
would play out in that manner? Just want to make sure I understand the
functionality.

--
Jameson Lopp
Software Engineer
Bronto Software, Inc

On 09/29/2011 03:32 PM, Doug Meil wrote:
>
> Here are a few links on table cleanup and major compactions...
>
> http://hbase.apache.org/book.html#schema.minversions (ttl related)
>
> http://hbase.apache.org/book.html#perf.deleting.queue
>
> http://hbase.apache.org/book.html#compaction
>
> On 9/29/11 2:29 PM, "Ted Yu" wrote:
>
>> Doug Meil may point you to related doc.
>>
>> Take a look at this as well:
>> https://issues.apache.org/jira/browse/HBASE-4241
>>
>> On Thu, Sep 29, 2011 at 11:22 AM, Jameson Lopp wrote:
>>
>>> Hm, well I didn't mention a number of other requirements for the feature
>>> I'm building, but long story short, I need to keep track of millions to
>>> billions of these counters and need the lookup time to be as close to
>>> constant time as possible, thus I was really hoping to avoid doing table
>>> scans.
>>>
>>> I'll admit I know nothing of the dangers of auto-pruning; is there an
>>> article / documentation I could read about it? Google wasn't very
>>> helpful.
>>>
>>> --
>>> Jameson Lopp
>>> Software Engineer
>>> Bronto Software, Inc
>>>
>>> On 09/29/2011 02:12 PM, Jean-Daniel Cryans wrote:
>>>
>>>> My advice usually regarding timestamps is if it's part of your data
>>>> model, it should appear somewhere in an HBase key. 99% of the time
>>>> overloading the HBase timestamps is a bad idea, especially with
>>>> counters since there's auto-pruning done in the Memstore!
>>>>
>>>> I would suggest you make time part of your row key, maybe one counter
>>>> per day, and then set the TTL on your table to 30 days. Then all you
>>>> need to do is a sequential scan for those 30 days maybe with a prefix
>>>> that refers to some event id.
>>>>
>>>> OpenTSDB is another way of doing it: http://opentsdb.net/
>>>>
>>>> J-D
>>>>
>>>> On Thu, Sep 29, 2011 at 11:04 AM, Jameson Lopp wrote:
>>>>
>>>>> I wish to store a count of 30-day trailing event data (e.g. # of
>>>>> clicks in past 30 days) and ended up reading the documentation for
>>>>> setTimeRange in the Increment operation.
>>>>>
>>>>> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Increment.html#getTimeRange%28%29
>>>>>
>>>>> I was hoping someone could clarify if it works as I'm imagining in
>>>>> this example scenario.
>>>>>
>>>>> 1) Current click count is 0
>>>>>
>>>>> 2) I process a click and I perform an increment operation with the
>>>>> time range set to minStamp = now and maxStamp = 30 days from now
>>>>>
>>>>> 3) I query for the value immediately and find it to be 1
>>>>>
>>>>> 4) Assuming no other clicks come in, if I query for the value in 31
>>>>> days, it will be returned as 0
>>>>>
>>>>> In essence, I'm looking for a way to set a TTL on my increment
>>>>> operation. Is this how it actually works? The documentation is a bit
>>>>> vague and I could imagine several other scenarios.
>>>>>
>>>>> --
>>>>> Jameson Lopp
>>>>> Software Engineer
>>>>> Bronto Software, Inc
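
For readers of the archive, J-D's suggestion in the thread above (time in the
row key, one counter per day, a 30-day TTL) can be sketched roughly as
follows. This is only an illustration under stated assumptions, not the HBase
client API: the dict named counters is a hypothetical in-memory stand-in for
the table, the "eventId:yyyyMMdd" key layout and all function names are
invented for the example, and the trailing total is computed with 30 lookups
by key where J-D describes a sequential scan. Real HBase TTL expiry is driven
by cell timestamps, which expire_old_rows only imitates.

```python
from datetime import date, timedelta

# Hypothetical in-memory stand-in for a counter table: row key -> count.
counters = {}


def row_key(event_id, day):
    """Build a row key with the day embedded, e.g. 'click-42:20110929'."""
    return event_id + ":" + day.strftime("%Y%m%d")


def record_event(event_id, day, amount=1):
    """Increment the daily bucket for this event."""
    key = row_key(event_id, day)
    counters[key] = counters.get(key, 0) + amount


def expire_old_rows(today, ttl_days=30):
    """Imitate a table TTL by dropping buckets that are ttl_days old or older."""
    cutoff = (today - timedelta(days=ttl_days)).strftime("%Y%m%d")
    for key in list(counters):
        # yyyyMMdd strings sort in date order, so string comparison is enough.
        if key.rsplit(":", 1)[1] <= cutoff:
            del counters[key]


def trailing_count(event_id, today, window_days=30):
    """Sum the daily buckets for the last window_days days, today included."""
    total = 0
    for offset in range(window_days):
        day = today - timedelta(days=offset)
        total += counters.get(row_key(event_id, day), 0)
    return total


# The scenario from the original question: one click, then 31 quiet days.
first_click = date(2011, 9, 29)
record_event("click-42", first_click)
print(trailing_count("click-42", first_click))
print(trailing_count("click-42", first_click + timedelta(days=31)))
```

With this layout the trailing total drops back to zero once the click's day
falls out of the window, and the number of buckets read per lookup is fixed
by the window size rather than by the number of counters in the table.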