hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: HBase querying across region servers
Date Thu, 28 Apr 2011 22:12:06 GMT
(please don't write back personally unless it's really personal)

That's all fine; a row is always contained in a single region. By
"state" I meant the case where you write some fancy filter yourself and
decide to keep some state in it, so that the filtering of one row could
affect how other rows are filtered.

Like I said earlier, that's not the case with the filters shipped with
HBase, so again, yes, this is going to work.
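
To make the distinction concrete, here is a rough plain-Java model of it. This is only a sketch and deliberately does not use the HBase API: RegionScanSketch, RowFilter, valueEquals, firstN, scan and region are all hypothetical names made up for illustration, and the assumption that each region starts with a fresh filter instance is a simplification of "state is not carried from one region to another". A stateless filter gives the same answer however the rows are split into regions; a filter that counts what it has kept does not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Supplier;

// Plain-Java model only: none of these types are HBase classes.
class RegionScanSketch {

    /** Hypothetical stand-in for a row filter: return true to keep the row. */
    interface RowFilter {
        boolean keep(String rowKey, String value);
    }

    /** Stateless: the decision depends only on the row being examined. */
    static RowFilter valueEquals(String wanted) {
        return (rowKey, value) -> wanted.equals(value);
    }

    /** Stateful: remembers how many rows it has kept so far. */
    static RowFilter firstN(int limit) {
        return new RowFilter() {
            private int kept = 0;

            @Override
            public boolean keep(String rowKey, String value) {
                if (kept >= limit) {
                    return false;
                }
                kept++;
                return true;
            }
        };
    }

    /** Scans the regions in order, assuming a fresh filter instance per region. */
    static List<String> scan(List<TreeMap<String, String>> regions,
                             Supplier<RowFilter> filterFactory) {
        List<String> rowKeys = new ArrayList<>();
        for (TreeMap<String, String> region : regions) {
            RowFilter filter = filterFactory.get(); // any filter state starts over here
            for (Map.Entry<String, String> row : region.entrySet()) {
                if (filter.keep(row.getKey(), row.getValue())) {
                    rowKeys.add(row.getKey());
                }
            }
        }
        return rowKeys;
    }

    /** Builds one sorted "region" from alternating row keys and values. */
    static TreeMap<String, String> region(String... keysAndValues) {
        TreeMap<String, String> rows = new TreeMap<>();
        for (int i = 0; i < keysAndValues.length; i += 2) {
            rows.put(keysAndValues[i], keysAndValues[i + 1]);
        }
        return rows;
    }

    public static void main(String[] args) {
        // The same four rows, held in one region or split across two.
        List<TreeMap<String, String>> oneRegion =
                Arrays.asList(region("a", "x", "b", "y", "c", "x", "d", "x"));
        List<TreeMap<String, String>> twoRegions =
                Arrays.asList(region("a", "x", "b", "y"), region("c", "x", "d", "x"));

        System.out.println(scan(oneRegion, () -> valueEquals("x")));  // [a, c, d]
        System.out.println(scan(twoRegions, () -> valueEquals("x"))); // [a, c, d]
        System.out.println(scan(oneRegion, () -> firstN(2)));         // [a, b]
        System.out.println(scan(twoRegions, () -> firstN(2)));        // [a, b, c, d]
    }
}
```

The value-comparison filter behaves like the SingleColumnValueFilter scan quoted below in the sense that it looks at one row at a time, which is why the region split does not change its result in this model.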

The documentation is all contained in the javadoc.

J-D

On Thu, Apr 28, 2011 at 3:07 PM, Ajay Govindarajan
<agovindarajan@yahoo.com> wrote:
> SingleColumnValueFilter filter = new SingleColumnValueFilter(
>         Bytes.toBytes(columnFamily), Bytes.toBytes(key),
>         CompareOp.EQUAL, Bytes.toBytes("someValue"));
> filter.setFilterIfMissing(true);
> Scan scan = new Scan();
> scan.setFilter(filter);
> ResultScanner scanner = hTable.getScanner(scan);
> for (Result r = scanner.next(); r != null; r = scanner.next()) {
>     String rowKey = Bytes.toString(r.getRow());
>     NavigableMap<byte[], byte[]> map =
>             r.getFamilyMap(Bytes.toBytes(columnFamily));
> }
>
>
> Will this code work across regions?
>
> Also, you say "if you happened to have some sort of state in your
> filter". As far as I can see, only the reset() and filterRow() methods seem
> to alter the state. Are there more methods that alter the state? If so, could
> you please point me to the relevant documentation?
>
> thanks very much
> -ajay
>
>
> ________________________________
> From: Jean-Daniel Cryans <jdcryans@apache.org>
> To: user@hbase.apache.org; Ajay Govindarajan <agovindarajan@yahoo.com>
> Sent: Thursday, April 28, 2011 1:41 PM
> Subject: Re: HBase querying across region servers
>
> Can you give an example of what you're trying to do?
>
> BTW what we mean when we say that filters don't work across region
> servers (actually it's more across regions, so it's also a problem on
> a single machine) is that if you happened to have some sort of state
> in your filter, it wouldn't be carried from one region to another. I
> don't think any of the filters HBase ships with have that sort of
> issue, so they can all be used to scan a full table if that's what you
> fancy.
>
> J-D
>
> On Thu, Apr 28, 2011 at 1:19 PM, Ajay Govindarajan
> <agovindarajan@yahoo.com> wrote:
>> Sorry, what I meant was Scans using Filters. There are use cases for which
>> we will not know the row keys, so we have to resort to filters such as
>> SingleColumnValueFilter or PrefixFilter.
>> Since filters don't work across region servers, are there any alternative
>> APIs or workarounds? Or is there a fundamental schema design issue here?
>>
>> thanks
>> -ajay
>>
>>
>>
>>
>>
>>
>>
>> ________________________________
>> From: Bennett Andrews <bennett.j.andrews@gmail.com>
>> To: user@hbase.apache.org; Ajay Govindarajan <agovindarajan@yahoo.com>
>> Sent: Thursday, April 28, 2011 12:54 PM
>> Subject: Re: HBase querying across region servers
>>
>> Scans will work across region servers transparently. All you need to do is
>> specify a start row and end row. Use this when you are reading sequential
>> rows, as it will be faster.
>>
>> -bennett
>>
>>
>>
>> On Thu, Apr 28, 2011 at 2:30 PM, Ajay Govindarajan
>> <agovindarajan@yahoo.com> wrote:
>>
>>> We have a bunch of synchronous requests that will read and write data to
>>> HBase. I have written some code that uses the HBase client library to use
>>> Puts for writes, Gets for reads with row keys, and Scans for reads with
>>> filters. Currently we have only one region server (since it's a dev
>>> environment), so the queries work fine. Eventually we will have multiple
>>> region servers in our production environment. From the documentation it
>>> seems that Gets and Puts will work across multiple region servers while
>>> Scans don't.
>>>
>>> So how do I solve this problem to get scans to work across multiple region
>>> servers? Should I avoid using scans and replace them with Gets using
>>> filters? Is that a big performance overhead?
>>> Or is there a framework to perform scan-like queries across multiple region
>>> servers?
>>>
>>> Any help will be appreciated.
>>>
>>> thanks
>>> -ajay
>>>
>
>
>
