hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Efficient time based queries - TIMERANGE or STARTROW/STOPROW?
Date Wed, 12 Apr 2017 17:39:23 GMT
Since STARTROW is specified (with uuid) in both of your examples, I think
their efficiency should be about the same.

Cheers

On Wed, Apr 12, 2017 at 10:33 AM, Josh <jofo90@gmail.com> wrote:

> Hi,
>
> I am just getting started with HBase, and have a question about the
> efficiency of timestamp based scans.
>
> My table's row key has structure `uuid#reverse_timestamp` where
> reverse_timestamp is (java.lang.Long.MAX_VALUE - time in millis when the
> row was written). For a given uuid I want to be able to retrieve the most
> recent 10 rows in the table where timestamp is greater than x. It's
> possible that a given uuid may have many thousands of rows (with different
> timestamps).
>
> I found there are two ways to run my query:
> 1. use HBase's built in timestamps and scan a time range:
> scan 'mytable', {STARTROW => '647b2194-fbb8-46af-95ba-f498ddc8adcc',
> TIMERANGE => [x, current_time], LIMIT => 10}
>
> 2. use only my row keys to do the scan, with STARTROW and STOPROW:
> scan 'mytable', {STARTROW => '647b2194-fbb8-46af-95ba-f498ddc8adcc',
> STOPROW => '647b2194-fbb8-46af-95ba-f498ddc8adcc#x', LIMIT => 10}
>
> Both of these seem to work - but is one more efficient than the other?
>
> Thanks for any advice,
> Josh
>
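
For reference, the `uuid#reverse_timestamp` key scheme described in the question could be sketched roughly as below. This is only an illustration using the plain JDK, not HBase client code: the class and method names are hypothetical, and the zero-padding of the reversed timestamp to 19 digits is an assumption (without a fixed width, string ordering of the keys would not match numeric ordering). Note that under this scheme the stop bound for "timestamp greater than x" would be built from Long.MAX_VALUE - x rather than from x itself.

```java
import java.util.Locale;

// Hypothetical helper; names are illustrative and not part of any HBase API.
final class ReverseTimestampKeys {
    static final char SEP = '#';

    // Reverse timestamp as described in the question: newer writes get smaller values.
    static long reverse(long millis) {
        return Long.MAX_VALUE - millis;
    }

    // Assumption: pad to 19 digits (the width of Long.MAX_VALUE) so that
    // lexicographic order of the key strings matches numeric order.
    static String pad(long value) {
        return String.format(Locale.ROOT, "%019d", value);
    }

    static String rowKey(String uuid, long millis) {
        return uuid + SEP + pad(reverse(millis));
    }

    // Scan start: the uuid prefix, so the newest row for this uuid comes first.
    static String startRow(String uuid) {
        return uuid + String.valueOf(SEP);
    }

    // Scan stop (exclusive): the key for timestamp x, so only rows with
    // timestamp strictly greater than x fall before it.
    static String stopRow(String uuid, long x) {
        return rowKey(uuid, x);
    }

    public static void main(String[] args) {
        String uuid = "647b2194-fbb8-46af-95ba-f498ddc8adcc";
        long x = System.currentTimeMillis() - 60_000L; // one minute ago
        System.out.println("STARTROW => " + startRow(uuid));
        System.out.println("STOPROW  => " + stopRow(uuid, x));
    }
}
```

With keys built this way, a scan from startRow to stopRow with LIMIT => 10 would return the 10 most recent rows newer than x, since the most recent row sorts first.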
