hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: HBase querying across region servers
Date Thu, 28 Apr 2011 22:12:06 GMT
(please don't write back personally unless it's really personal)

That's all fine; rows are always contained in a single region. By
"state" I meant the case where you write some fancy filter yourself
and have it keep state, so that the filtering of one row affects how
other rows are filtered.

Like I said earlier, that's not the case with the ones shipped with
HBase, so again yes this is going to work.
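
To make the "state" point concrete, here is a minimal sketch in plain Java. It is a toy model and does not use the HBase API: the class `RegionStateDemo`, the `firstN` filter, and the `scan` helper are all hypothetical names, and the model assumes each region evaluates its own fresh copy of the filter, which is the behaviour described above (state is not carried from one region to another).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

public class RegionStateDemo {

    // Hypothetical stateful filter: accepts only the first `limit` rows it sees.
    // The counter is the "state" that would not survive a region boundary.
    static Predicate<String> firstN(int limit) {
        int[] seen = {0};
        return row -> seen[0]++ < limit;
    }

    // Toy scan (assumption of this model): every region gets a fresh filter
    // instance, so nothing learned in one region is visible in the next.
    static List<String> scan(List<List<String>> regions,
                             Supplier<Predicate<String>> filterFactory) {
        List<String> out = new ArrayList<>();
        for (List<String> region : regions) {
            Predicate<String> filter = filterFactory.get();
            for (String row : region) {
                if (filter.test(row)) {
                    out.add(row);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The same five row keys, stored as one region or split into two.
        List<List<String>> oneRegion =
                Arrays.asList(Arrays.asList("a", "b", "c", "d", "e"));
        List<List<String>> twoRegions =
                Arrays.asList(Arrays.asList("a", "b", "c"), Arrays.asList("d", "e"));

        // Stateful filter: the result depends on where the region boundary is.
        System.out.println(scan(oneRegion, () -> firstN(2)));   // [a, b]
        System.out.println(scan(twoRegions, () -> firstN(2)));  // [a, b, d, e]

        // Stateless filter: each row is judged on its own, so the split is irrelevant.
        System.out.println(scan(oneRegion, () -> row -> row.compareTo("c") >= 0));   // [c, d, e]
        System.out.println(scan(twoRegions, () -> row -> row.compareTo("c") >= 0));  // [c, d, e]
    }
}
```

A filter like SingleColumnValueFilter behaves like the stateless case here: it decides each row from that row's own cells, so region boundaries don't change the result.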

The documentation is all contained in the javadoc.


On Thu, Apr 28, 2011 at 3:07 PM, Ajay Govindarajan
<agovindarajan@yahoo.com> wrote:
> SingleColumnValueFilter filter = new SingleColumnValueFilter(
>         Bytes.toBytes(columnFamily), Bytes.toBytes(key),
>         CompareOp.EQUAL, Bytes.toBytes("someValue"));
> filter.setFilterIfMissing(true);
> Scan scan = new Scan();
> scan.setFilter(filter);
> ResultScanner scanner = hTable.getScanner(scan);
> for (Result r = scanner.next(); r != null; r = scanner.next()) {
>     String rowKey = Bytes.toString(r.getRow());
>     NavigableMap<byte[], byte[]> map =
>             r.getFamilyMap(Bytes.toBytes(columnFamily));
> }
> Will this code work across regions?
> Also you say that "if you happened to have some sort of state in your
> filter"? As far as I can see only the reset() and filterRow() methods seem
> to alter the state. Are there more methods that alter the state? If so could
> you please point me to the relevant documentation?
> thanks very much
> -ajay
> ________________________________
> From: Jean-Daniel Cryans <jdcryans@apache.org>
> To: user@hbase.apache.org; Ajay Govindarajan <agovindarajan@yahoo.com>
> Sent: Thursday, April 28, 2011 1:41 PM
> Subject: Re: HBase querying across region servers
> Can you give an example of what you're trying to do?
> BTW what we mean when we say that filters don't work across region
> servers (actually it's more across regions, so it's also a problem on
> a single machine) is that if you happened to have some sort of state
> in your filter, it wouldn't be carried from one region to another. I
> don't think any of the filters HBase ships with have that sort of
> issue, so they can all be used to scan a full table if that's what you
> fancy.
> J-D
> On Thu, Apr 28, 2011 at 1:19 PM, Ajay Govindarajan
> <agovindarajan@yahoo.com> wrote:
>> Sorry, what I meant was Scans using Filters. There are use-cases for which
>> we will not know the row keys. So we have to resort to filters using
>> SingleColumnValueFilter or PrefixFilter.
>> Since filters don't work across region servers, are there any alternative
>> APIs or workarounds? Or is there a fundamental schema design issue here?
>> thanks
>> -ajay
>> ________________________________
>> From: Bennett Andrews <bennett.j.andrews@gmail.com>
>> To: user@hbase.apache.org; Ajay Govindarajan <agovindarajan@yahoo.com>
>> Sent: Thursday, April 28, 2011 12:54 PM
>> Subject: Re: HBase querying across region servers
>> Scans will work across region servers transparently. All you need to do
>> is specify a start row and end row. Use this when you are reading
>> sequential rows, as it will be faster.
>> -bennett
>> On Thu, Apr 28, 2011 at 2:30 PM, Ajay Govindarajan
>> <agovindarajan@yahoo.com> wrote:
>>> We have a bunch of synchronous requests that will read and write data to
>>> HBase. I have written some code that uses the HBase client library to use
>>> Puts for writes, Gets for reads with row keys, and Scans for reads with
>>> filters. Currently we have only one region server (since it's a dev
>>> environment) so the queries work fine. Eventually we will have multiple
>>> region servers in our production environment. From the documentation it
>>> seems that Gets and Puts will work across multiple region servers while
>>> Scans don't.
>>> So how do I solve this problem to get scans to work across multiple region
>>> servers? Should I avoid using scans and replace them with Gets using
>>> filters? Is that a big performance overhead?
>>> Or is there a framework to perform scan-like queries across multiple region
>>> servers?
>>> Any help will be appreciated.
>>> thanks
>>> -ajay
