hbase-user mailing list archives

From Josh <jof...@gmail.com>
Subject Re: Efficient time based queries - TIMERANGE or STARTROW/STOPROW?
Date Wed, 12 Apr 2017 17:43:21 GMT
Hi Ted,

Thanks for the fast reply!
Ok I see - just out of interest, if I changed my row key to be
uuid#timestamp  (instead of uuid#reverse_timestamp) - would the timestamp
approach still be equally efficient? I just want to understand whether or
not the timestamp approach is relying on the ordering of my row keys.

Josh

On Wed, Apr 12, 2017 at 6:39 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> Since STARTROW is specified (with uuid) in both of your examples, I think
> their efficiency should be equivalent.
>
> Cheers
>
> On Wed, Apr 12, 2017 at 10:33 AM, Josh <jofo90@gmail.com> wrote:
>
> > Hi,
> >
> > I am just getting started with HBase, and have a question about the
> > efficiency of timestamp based scans.
> >
> > My table's row key has structure `uuid#reverse_timestamp` where
> > reverse_timestamp is (java.lang.Long.MAX_VALUE - time in millis when the
> > row was written). For a given uuid I want to be able to retrieve the most
> > recent 10 rows in the table where timestamp is greater than x. It's
> > possible that a given uuid may have many thousands of rows (with
> > different timestamps).
> >
> > I found there are two ways to run my query:
> > 1. use HBase's built-in timestamps and scan a time range:
> > scan 'mytable', {STARTROW => '647b2194-fbb8-46af-95ba-f498ddc8adcc',
> > TIMERANGE => [x, current_time], LIMIT => 10}
> >
> > 2. use only my row keys to do the scan, with STARTROW and STOPROW:
> > scan 'mytable', {STARTROW => '647b2194-fbb8-46af-95ba-f498ddc8adcc',
> > STOPROW => '647b2194-fbb8-46af-95ba-f498ddc8adcc#x', LIMIT => 10}
> >
> > Both of these seem to work - but is one more efficient than the other?
> >
> > Thanks for any advice,
> > Josh
> >
>
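
To make the row-key approach from the original question concrete, here is a minimal sketch of how `uuid#reverse_timestamp` keys sort and how a STARTROW/STOPROW range picks out the newest rows written after x. It does not use the HBase client API: a `TreeMap` stands in for HBase's lexicographically sorted rows, and the class and method names (`ReverseTimestampKeys`, `rowKey`, `newestSince`) are hypothetical. The zero-padding to 19 digits is an assumption added here so that string order matches numeric order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeMap;

public class ReverseTimestampKeys {
    // Long.MAX_VALUE - millis, zero-padded to 19 digits (assumption) so that
    // lexicographic order of the string matches numeric order.
    public static String reverseTimestamp(long millis) {
        return String.format(Locale.ROOT, "%019d", Long.MAX_VALUE - millis);
    }

    // Row key in the form described in the question: uuid#reverse_timestamp.
    public static String rowKey(String uuid, long millis) {
        return uuid + "#" + reverseTimestamp(millis);
    }

    // Returns up to `limit` keys for `uuid` written strictly after `sinceMillis`,
    // newest first. The TreeMap is only a stand-in for a sorted HBase table.
    public static List<String> newestSince(TreeMap<String, String> table,
                                           String uuid, long sinceMillis, int limit) {
        String startRow = uuid + "#";               // inclusive, like STARTROW
        String stopRow = rowKey(uuid, sinceMillis); // exclusive, like STOPROW
        List<String> result = new ArrayList<>();
        for (String key : table.subMap(startRow, true, stopRow, false).keySet()) {
            if (result.size() == limit) {
                break;
            }
            result.add(key);
        }
        return result;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        String uuid = "647b2194-fbb8-46af-95ba-f498ddc8adcc";
        for (long t = 1000; t <= 5000; t += 1000) {
            table.put(rowKey(uuid, t), "written at " + t);
        }
        // Rows newer than t=2000, newest first, at most 10.
        for (String key : newestSince(table, uuid, 2000L, 10)) {
            System.out.println(key + " -> " + table.get(key));
        }
    }
}
```

Because the newest row has the smallest reverse timestamp, it sorts first within its uuid, so the range can stop as soon as the limit is reached. With a plain `uuid#timestamp` key the oldest row would sort first instead, and the same forward range would return the oldest matching rows rather than the most recent ones.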
