Message-ID: <5179C197.7030901@salesforce.com>
Date: Thu, 25 Apr 2013 16:51:51 -0700
From: James Taylor <jtaylor@salesforce.com>
To: user@hbase.apache.org
Subject: Re: Coprocessors
In-Reply-To: <5179BA1B024B06D20364008A_0_4537@p868>
Our performance engineer, Mujtaba Chohan, has agreed to put together a
benchmark for you. We only have a four node cluster of pretty average
boxes, but it should give you an idea.

There's no performance impact from attrib_id not being part of the PK,
since you're not filtering on it (if I understand things correctly).

A few more questions for you:
- How many rows should we use? 1B?
- How many rows would be filtered by object_id and field_type?
- Any particular key distribution, or is random fine?
- What's the minimum key size we should use for object_id and
  field_type? 2 bytes each?
- Any particular kind of aggregation? count(attrib1)? sum(attrib1)?
  A sample query would be helpful.

Since you're upgrading, use the latest on the 0.94 branch, 0.94.7.

Thanks,
James

On 04/25/2013 04:19 PM, Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN) wrote:
> James: First of all, this looks quite promising.
>
> The table schema outlined in your other message is correct, except that
> attrib_id will not be in the primary key. Will that be a problem with
> respect to the skip-scan filter's performance? (It doesn't seem like it...)
>
> Could you share any sort of benchmark numbers? I want to try this out
> right away, but I have to wait for my cluster administrator to upgrade
> us from HBase 0.92 first!
>
> ----- Original Message -----
> From: user@hbase.apache.org
> To: user@hbase.apache.org
> At: Apr 25 2013 18:45:14
>
> On 04/25/2013 03:35 PM, Gary Helmling wrote:
>>> I'm looking to write a service that runs alongside the region servers
>>> and acts as a proxy b/w my application and the region servers.
>>>
>>> I plan to use the logic in the HBase client's HConnectionManager to
>>> segment my request of 1M rowkeys into sub-requests per region server.
>>> These are sent over to the proxy, which fetches the data from the
>>> region server, aggregates locally and sends data back.
>>> Does this sound reasonable, or even a useful thing to pursue?
>>>
>> This is essentially what coprocessor endpoints (called through
>> HTable.coprocessorExec()) do. (One difference is that there is a
>> parallel request per region, not per region server, though that is a
>> potential optimization that could be made as well.)
>>
>> The tricky part I see for the case you describe is splitting your full
>> set of row keys up correctly per region. You could send the full set
>> of row keys to each endpoint invocation, and have the endpoint
>> implementation filter down to only those keys present in the current
>> region. But that would be a lot of overhead on the request side. You
>> could split the row keys into per-region sets on the client side, but
>> I'm not sure we provide sufficient context for the Batch.Callable
>> instance you provide to coprocessorExec() to determine which region it
>> is being invoked against.
> Sudarshan,
> In our head branch of Phoenix (we're targeting this for a 1.2 release
> in two weeks), we've implemented a skip scan filter that functions
> similarly to a batched get, except:
> 1) it's more flexible, in that it can jump not only from a single key
> to another single key, but also from range to range;
> 2) it's faster, about 3-4x;
> 3) you can use it in combination with aggregation, since it's a filter.
>
> The scan is chunked up by region and only the keys in each region are
> sent, along the lines you and Gary have described. Then the results are
> merged together by the client automatically.
>
> How would you decompose your row key into columns? Is there a time
> component? Let me walk you through an example where you might have a
> LONG id value plus perhaps a timestamp (it works equally well if you
> only had a single column in your PK). If you provide a bit more info on
> your use case, I can tailor it more exactly.
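[Editor's note: the client-side, per-region key splitting described above
can be sketched roughly as below. This is an illustrative sketch, not
Phoenix's actual implementation; the class and method names are made up,
and it assumes you already have the sorted region start keys (e.g. from
HTable.getStartKeys(), with the first region's start key represented here
as the empty string).]

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.NavigableSet;
import java.util.TreeMap;

// Hypothetical sketch: group row keys into per-region batches so that
// each endpoint invocation only receives the keys its region can serve.
// A row key belongs to the region whose start key is the greatest start
// key that is <= the row key (a floor lookup on the sorted start keys).
public class RegionKeySplitter {
    public static Map<String, List<String>> splitByRegion(
            NavigableSet<String> regionStartKeys, Collection<String> rowKeys) {
        Map<String, List<String>> byRegion = new TreeMap<>();
        for (String key : rowKeys) {
            // floor() finds the region start key covering this row key;
            // the first region conventionally starts at the empty key "".
            String regionStart = regionStartKeys.floor(key);
            if (regionStart == null) {
                regionStart = regionStartKeys.first();
            }
            byRegion.computeIfAbsent(regionStart, k -> new ArrayList<>())
                    .add(key);
        }
        return byRegion;
    }
}
```

[Each per-region list would then be sent only to the endpoint invocation
for that region, avoiding the request-side overhead of shipping the full
key set everywhere.]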
>
> Create a schema:
>
> CREATE TABLE t (key BIGINT NOT NULL, ts DATE NOT NULL, data VARCHAR
> CONSTRAINT pk PRIMARY KEY (key, ts));
>
> Populate your data using our UPSERT statement.
>
> Aggregate over a set of keys like this:
>
> SELECT count(*) FROM t WHERE key IN (?,?,?) AND ts > ? AND ts < ?
>
> where you bind the ?s at runtime (probably building the statement
> programmatically based on how many keys you're binding).
>
> Then Phoenix would jump around the key space of your table using the
> seek-next hint feature provided by filters. You'd just use the regular
> JDBC ResultSet to get your count back.
>
> If you want more info and/or a benchmark of seeking over 250K keys in a
> billion row table, let me know.
>
> Thanks,
>
> James
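[Editor's note: building the statement programmatically for a variable
number of keys, as suggested above, might look like the sketch below.
The helper name buildAggQuery is made up, and the binding code in the
comment assumes the standard JDBC PreparedStatement API that Phoenix's
driver exposes.]

```java
// Sketch: generate the parameterized aggregate query for N keys,
// matching the query shape shown in the message above.
public class QueryBuilder {
    public static String buildAggQuery(int keyCount) {
        StringBuilder sql =
            new StringBuilder("SELECT count(*) FROM t WHERE key IN (");
        for (int i = 0; i < keyCount; i++) {
            sql.append(i == 0 ? "?" : ",?");
        }
        sql.append(") AND ts > ? AND ts < ?");
        return sql.toString();
    }
}
```

[You would then prepare this string via the Phoenix JDBC driver, bind
each key with stmt.setLong(i + 1, keys[i]), bind the two timestamp
bounds last, and read the count from the ResultSet as usual.]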