From: Michel Segel
Subject: Re: how to model data based on "time bucket"
Date: Mon, 28 Jan 2013 09:54:08 -0600
To: "user@hbase.apache.org"

Tough one, in that if your events are keyed on time alone, you will hit a hot spot on write. Reads, not so much...

TSDB would be a good start...

You may not need 'buckets', just a timestamp, and then set up start and stop key values for the scan.

Sent from a remote device. Please excuse any typos...

Mike Segel

On Jan 28, 2013, at 7:06 AM, Oleg Ruchovets wrote:

> Hi,
>
> I have the following row data structure:
>
> event_id | time
> =============
> event1 | 10:07
> event2 | 10:10
> event3 | 10:12
>
> event4 | 10:20
> event5 | 10:23
> event6 | 10:25
>
>
> The number of records is 50-100 million.
>
>
> Question:
>
> I need to find the group of events that starts from eventX and falls
> within a time window (bucket) of T.
>
>
> For example, if T = 7 minutes:
> Starting from event1, {event1, event2, event3} were detected during
> 7 minutes.
>
> Starting from event2, {event2, event3} were detected during 7
> minutes.
>
> Starting from event4, {event4, event5, event6} were detected during
> 7 minutes.
> Is there a way to model the data in HBase to get this?
>
> Thanks
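
A minimal sketch of the start/stop key idea from the reply, written in plain Python rather than against the HBase client API. It only imitates a range scan over lexicographically sorted row keys. The key layout (zero-padded minutes since midnight, a "|" separator, then the event id), the helper names, and the choice to treat the stop key as exclusive are all assumptions made for illustration; none of them come from the thread.

```python
from bisect import bisect_left

# Sample events from the question: (event_id, "HH:MM").
EVENTS = [
    ("event1", "10:07"), ("event2", "10:10"), ("event3", "10:12"),
    ("event4", "10:20"), ("event5", "10:23"), ("event6", "10:25"),
]

def to_minutes(hhmm):
    """Convert "HH:MM" to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

def row_key(minutes, event_id):
    """Hypothetical row key: zero-padded time first, so that
    lexicographic key order matches time order."""
    return f"{minutes:04d}|{event_id}"

# Stand-in for a table whose rows are kept sorted by key.
ROWS = sorted(row_key(to_minutes(t), e) for e, t in EVENTS)

def scan(rows, start_key, stop_key):
    """Return keys with start_key <= key < stop_key.
    The stop key is treated as exclusive (an assumption of this sketch)."""
    lo = bisect_left(rows, start_key)
    hi = bisect_left(rows, stop_key)
    return rows[lo:hi]

def window(rows, start_hhmm, t_minutes):
    """Event ids seen in [start, start + t_minutes)."""
    start = to_minutes(start_hhmm)
    start_key = f"{start:04d}|"
    stop_key = f"{start + t_minutes:04d}|"
    return [key.split("|", 1)[1] for key in scan(rows, start_key, stop_key)]

if __name__ == "__main__":
    for first in ("10:07", "10:10", "10:20"):
        print(first, window(ROWS, first, 7))
```

With T = 7, the three calls return the same groups as in the question. Note that this key layout is keyed on time alone, so it would still have the write hot spot mentioned in the reply; the sketch only covers the read side.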