hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-4811) Support reverse Scan
Date Wed, 11 Feb 2015 18:25:15 GMT

     [ https://issues.apache.org/jira/browse/HBASE-4811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4811:
-------------------------
    Release Note: 
== What is it?

HBase 0.98 introduces a reverse scanner, so that you can scan rows in reverse order as well
as the default order. Previously, if you wanted to scan in reverse, you needed to save rows
in reverse order.

== Why do you want it?

One common use case is data stored with timestamps. Previously, when you created the schema,
you needed to make a design decision about whether you wanted faster access to records with
old timestamps or new ones. If you wanted to be able to scan in reverse, you needed to store
the data twice, sorted forward and reverse. 

With the reverse scanner in HBase 0.98, you can scan your data in either direction. The
reverse scanner is only a few percent slower than the default scanner.

== How do you set it up?
No setup is required.

== How do you use it?
Call setReversed(boolean reversed) on your Scan instance:
scan.setReversed(true);

If you specify both a startRow and a stopRow for a reverse scan, the startRow must sort
lexicographically after the stopRow.

Refer to the API documentation for more information about the Scan API (http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html).

NOTE: Only the row order is reversed. Columns within a row still sort forward even if you
specify reverse on your scan.
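
The ordering rules above can be illustrated with a small sketch. This is not the HBase
client API: it is a stand-in built on java.util.TreeMap, and the class name, method names,
and sample rows are hypothetical. It assumes the stopRow is exclusive and the startRow
inclusive, and it compares Java Strings, which matches byte-wise lexicographic order only
for plain ASCII row keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a table; NOT the HBase client API.
class ReverseScanSketch {

    // Returns "row/column/value" cells for a reverse scan that starts at
    // startRow (inclusive) and stops before stopRow (exclusive).
    static List<String> reverseScan(
            NavigableMap<String, NavigableMap<String, String>> table,
            String startRow, String stopRow) {
        // In a reverse scan the startRow must sort after the stopRow.
        if (startRow.compareTo(stopRow) < 0) {
            throw new IllegalArgumentException(
                    "startRow must sort after stopRow for a reverse scan");
        }
        List<String> cells = new ArrayList<>();
        // Select (stopRow, startRow] and walk the rows in descending order.
        NavigableMap<String, NavigableMap<String, String>> rows =
                table.subMap(stopRow, false, startRow, true).descendingMap();
        for (Map.Entry<String, NavigableMap<String, String>> row : rows.entrySet()) {
            // Columns inside each row are still visited in forward order.
            for (Map.Entry<String, String> col : row.getValue().entrySet()) {
                cells.add(row.getKey() + "/" + col.getKey() + "/" + col.getValue());
            }
        }
        return cells;
    }

    // Builds five sample rows, each with two columns (made-up data).
    static NavigableMap<String, NavigableMap<String, String>> sampleTable() {
        NavigableMap<String, NavigableMap<String, String>> table = new TreeMap<>();
        for (String row : new String[] {"aaa", "bbb", "ccc", "ddd", "eee"}) {
            NavigableMap<String, String> cols = new TreeMap<>();
            cols.put("c1:q1", "value1");
            cols.put("c1:q2", "value2");
            table.put(row, cols);
        }
        return table;
    }

    public static void main(String[] args) {
        // Rows come back as ddd, then ccc; within each row q1 precedes q2.
        for (String cell : reverseScan(sampleTable(), "ddd", "bbb")) {
            System.out.println(cell);
        }
    }
}
```

With the sample data, a reverse scan from "ddd" to "bbb" visits row ddd and then row ccc,
and within each row c1:q1 still comes before c1:q2.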

> Support reverse Scan
> --------------------
>
>                 Key: HBASE-4811
>                 URL: https://issues.apache.org/jira/browse/HBASE-4811
>             Project: HBase
>          Issue Type: New Feature
>          Components: Client
>    Affects Versions: 0.20.6, 0.94.7
>            Reporter: John Carrino
>            Assignee: chunhui shen
>             Fix For: 0.98.0
>
>         Attachments: 4811-0.94-v22.txt, 4811-0.94-v23.txt, 4811-0.94-v25.txt, 4811-0.94-v3.txt,
4811-trunk-v10.txt, 4811-trunk-v29.patch, 4811-trunk-v5.patch, HBase-4811-0.94-v2.txt, HBase-4811-0.94.3modified.txt,
hbase-4811-0.94 v21.patch, hbase-4811-0.94-v24.patch, hbase-4811-trunkv1.patch, hbase-4811-trunkv11.patch,
hbase-4811-trunkv12.patch, hbase-4811-trunkv13.patch, hbase-4811-trunkv14.patch, hbase-4811-trunkv15.patch,
hbase-4811-trunkv16.patch, hbase-4811-trunkv17.patch, hbase-4811-trunkv18.patch, hbase-4811-trunkv19.patch,
hbase-4811-trunkv20.patch, hbase-4811-trunkv21.patch, hbase-4811-trunkv24.patch, hbase-4811-trunkv24.patch,
hbase-4811-trunkv25.patch, hbase-4811-trunkv26.patch, hbase-4811-trunkv27.patch, hbase-4811-trunkv28.patch,
hbase-4811-trunkv4.patch, hbase-4811-trunkv6.patch, hbase-4811-trunkv7.patch, hbase-4811-trunkv8.patch,
hbase-4811-trunkv9.patch
>
>
> A reversed scan scans the rows backward.
> In a reversed scan, the StartRow is bigger than the StopRow.
> For example, for the following rows:
> aaa/c1:q1/value1
> aaa/c1:q2/value2
> bbb/c1:q1/value1
> bbb/c1:q2/value2
> ccc/c1:q1/value1
> ccc/c1:q2/value2
> ddd/c1:q1/value1
> ddd/c1:q2/value2
> eee/c1:q1/value1
> eee/c1:q2/value2
> you could do a reversed scan from 'ddd' to 'bbb' (exclusive) like this:
> Scan scan = new Scan();
> scan.setStartRow(Bytes.toBytes("ddd"));
> scan.setStopRow(Bytes.toBytes("bbb"));
> scan.setReversed(true);
> for (Result result : htable.getScanner(scan)) {
>   System.out.println(result);
> }
> You could also do the reversed scan in the shell like this:
> hbase> scan 'table',{REVERSED => true,STARTROW=>'ddd', STOPROW=>'bbb'}
> And the output is:
> ddd/c1:q1/value1
> ddd/c1:q2/value2
> ccc/c1:q1/value1
> ccc/c1:q2/value2
> All the documentation I find about HBase says that if you want forward and reverse scans,
> you should just build 2 tables, one ascending and one descending. Is there a fundamental
> reason that HBase only supports forward Scan? It seems like a lot of extra space overhead
> and coding overhead (to keep them in sync) to support 2 tables.
> I am assuming this has been discussed before, but I can't find any discussion of it
> or of why it would be infeasible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
