hbase-dev mailing list archives

From Vladimir Rodionov <vladrodio...@gmail.com>
Subject Re: Scanner with explicit columns list is very slow
Date Mon, 14 Oct 2013 21:46:35 GMT
I profiled the last test case (5 columns total and 2 in a scan).

80% of StoreScanner.next() execution time is spent in:

StoreScanner.reseek() - 71%
ScanQueryMatcher.getKeyForNextColumn() - 6%
ScanQueryMatcher.getKeyForNextRow() - 2%

Should I open a JIRA?
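
A toy model may help show why two reseeks per row can cost more than reading all 5 cells in order. The sketch below is not HBase code. The class name, the integer key encoding, and the use of a binary search as a stand-in for a reseek are all assumptions made for illustration, and key comparisons are only a rough proxy for CPU time.

```java
// Hypothetical toy model, not HBase internals: cells are sorted long keys,
// and a "reseek" is approximated by a forward binary search.
final class ReseekCostSketch {
    static final int COLUMNS_PER_ROW = 5;   // 5 columns total, as in the profiled test
    static final int[] WANTED = {0, 2};     // 2 columns in the scan (zero-based indexes)

    /** Builds sorted cell keys where key = row * COLUMNS_PER_ROW + column. */
    static long[] buildCells(int rows) {
        long[] cells = new long[rows * COLUMNS_PER_ROW];
        for (int i = 0; i < cells.length; i++) {
            cells[i] = i;
        }
        return cells;
    }

    /** Reads every cell in order and filters afterwards; returns {matched, comparisons}. */
    static long[] scanAllThenFilter(long[] cells) {
        long matched = 0;
        long comparisons = 0;
        for (long key : cells) {
            int col = (int) (key % COLUMNS_PER_ROW);
            comparisons++;                  // one check per cell visited
            if (col == WANTED[0] || col == WANTED[1]) {
                matched++;
            }
        }
        return new long[] {matched, comparisons};
    }

    /** Seeks to each wanted column of each row; returns {matched, comparisons}. */
    static long[] scanWithSeeks(long[] cells) {
        long matched = 0;
        long comparisons = 0;
        int rows = cells.length / COLUMNS_PER_ROW;
        int pos = 0;
        for (int row = 0; row < rows; row++) {
            for (int col : WANTED) {
                long target = (long) row * COLUMNS_PER_ROW + col;
                // Forward seek: first index at or after pos whose key is >= target.
                int lo = pos;
                int hi = cells.length;
                while (lo < hi) {
                    int mid = (lo + hi) >>> 1;
                    comparisons++;          // one key comparison per probe
                    if (cells[mid] < target) {
                        lo = mid + 1;
                    } else {
                        hi = mid;
                    }
                }
                pos = lo;
                if (pos < cells.length && cells[pos] == target) {
                    matched++;
                }
            }
        }
        return new long[] {matched, comparisons};
    }

    public static void main(String[] args) {
        long[] cells = buildCells(1000);
        long[] linear = scanAllThenFilter(cells);
        long[] seeks = scanWithSeeks(cells);
        System.out.println("scan all + filter: matched=" + linear[0] + " comparisons=" + linear[1]);
        System.out.println("scan with seeks:   matched=" + seeks[0] + " comparisons=" + seeks[1]);
    }
}
```

Both methods return the same cells. In this model the seek-based scan performs more key comparisons because each seek costs roughly log2(n) probes, while skipping the unwanted cells of a 5-column row sequentially costs at most a few steps. A real reseek does different work than a binary search over an array, so only the direction of the difference carries over, not the numbers.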


On Mon, Oct 14, 2013 at 2:03 PM, Vladimir Rodionov
<vladrodionov@gmail.com> wrote:

> I modified the tests.
>
> I created a table with one CF and 5 columns: CQ1, ..., CQ5
>
> 1. Scan.addColumn(CF, CQ1);
>     Scan.addColumn(CF, CQ3);
>
> 2. Scan.addFamily(CF);
>
> Scan performance from block cache:
>
> 1.  400K rows per sec
> 2.  1.6M rows per sec
>
> The explicit-column scan performs even worse in this case. It is
> much faster to scan the WHOLE rows and filter columns later in a Filter
> than to specify columns directly in a Scan.
>
> Definitely needs to be explained/investigated.
>
>
> On Mon, Oct 14, 2013 at 11:18 AM, Vladimir Rodionov <
> vrodionov@carrieriq.com> wrote:
>
>> It's 0.94.6, and there is a chance that the issue has already been fixed.
>>
>> Simple table: one column family + one qualifier
>>
>> Two types of scans:
>>
>> 1. Scan.addFamily(CF)
>>
>> 2. Scan.addColumn(CF, CQ)
>>
>> Both run on block cache (all data in memory)
>>
>> Tested on StoreScanner directly.
>>
>> 1. 4.2M KVs per second per thread
>> 2. 1.5M KVs per second per thread
>>
>> The difference? The first scanner's ScanQueryMatcher returns INCLUDE, DONE;
>> the second returns INCLUDE_NEXT_ROW, DONE.
>> The cost of the row reseek is huge.
>>
>> Best regards,
>> Vladimir Rodionov
>> Principal Platform Engineer
>> Carrier IQ, www.carrieriq.com
>> e-mail: vrodionov@carrieriq.com
>
>
