hbase-user mailing list archives

From gortiz <gor...@pragsis.com>
Subject Re: Lease exception when I execute large scan with filters.
Date Fri, 11 Apr 2014 07:04:52 GMT
Well, I guessed that, but it doesn't make much sense, because it's so 
slow. Right now I only have 100 rows with 1000 versions each.
I have checked the size of the dataset: each row is about 700 KB 
(around 7 GB, 100 rows x 1000 versions). So the scan should only have 
to check 100 rows x 700 KB = 70 MB, since it only checks the newest 
version. How can it spend so much time checking that quantity of data?
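(A quick back-of-the-envelope sketch of those numbers, just to make the expectation explicit; the sizes are the approximate ones quoted above, nothing measured:)

```python
# Back-of-the-envelope check of the figures above (approximate sizes, illustrative only).
ROWS = 100
ROW_SIZE_KB = 700  # approximate size of the newest version of one row

# If the scan only has to touch the newest version of each row:
newest_only_mb = ROWS * ROW_SIZE_KB / 1024
print(f"newest versions only: ~{newest_only_mb:.0f} MB")  # prints "newest versions only: ~68 MB"
```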

I'm generating the dataset again with a bigger block size (previously it 
was 64 KB; now it's going to be 1 MB). I could also try tuning the 
scanner caching and batching parameters, but I don't think they're going 
to make much difference.

Another test I want to do is to generate the same dataset with just 
100 versions. It should take around the same time, right? Or am I wrong?
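For what it's worth, my understanding (an assumption on my side, not verified against the 0.94 code) is that with a ValueFilter that matches almost nothing, a single next() call can spend a long time on the region server skipping rows before it returns anything, and if that exceeds the lease period the scanner lease expires. A rough model, with a purely hypothetical per-row cost:

```python
# Rough model of why a heavily filtered scan can lose its lease.
# The per-row cost here is hypothetical, for illustration only.
LEASE_PERIOD_S = 180   # lease/RPC timeout mentioned in this thread (3 minutes)
CACHING = 1000         # rows the server tries to gather per next() call
MS_PER_ROW = 200       # hypothetical server-side cost to scan and filter one row

seconds_per_next = CACHING * MS_PER_ROW / 1000.0
verdict = "exceeds" if seconds_per_next > LEASE_PERIOD_S else "fits within"
print(f"one next() call takes ~{seconds_per_next:.0f}s, which {verdict} the {LEASE_PERIOD_S}s lease")
```

Under that model, lowering the caching value (or raising the lease period) is what changes which side of the lease a call lands on, which is why those two knobs keep coming up in this thread.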

On 10/04/14 18:08, Ted Yu wrote:
> It should be the newest version of each value.
>
> Cheers
>
>
> On Thu, Apr 10, 2014 at 9:55 AM, gortiz <gortiz@pragsis.com> wrote:
>
>> Another little question: with the filter I'm using, do I check all the
>> versions, or just the newest? Because I'm wondering whether, when I do a scan
>> over the whole table, I look for the value "5" in the entire dataset or just
>> in the newest version of each value.
>>
>>
>> On 10/04/14 16:52, gortiz wrote:
>>
>>> I was trying to check the behaviour of HBase. The cluster is a group of
>>> old computers, one master, five slaves, each one with 2Gb, so, 12gb in
>>> total.
>>> The table has a column family with 1000 columns and each column with 100
>>> versions.
>>> There's another column family with four columns and one image of 100 KB.
>>>   (I've tried without this column family as well.)
>>> The table is partitioned manually in all the slaves, so data are balanced
>>> in the cluster.
>>>
>>> I'm executing this sentence *scan 'table1', {FILTER => "ValueFilter(=,
>>> 'binary:5')"}* in HBase 0.94.6.
>>> My lease and RPC timeouts are both three minutes.
>>> Since it's a full scan of the table, I have been playing with the
>>> BLOCKCACHE as well (just disabling and enabling it, not changing its size). I
>>> thought there were going to be too many calls to the GC. I'm not sure
>>> about this point.
>>>
>>> I know that it's not the best way to use HBase; it's just a test. I think
>>> it's not working because the hardware isn't enough, although I would
>>> like to try some kind of tuning to improve it.
>>>
>>>
>>> On 10/04/14 14:21, Ted Yu wrote:
>>>
>>>> Can you give us a bit more information:
>>>>
>>>> HBase release you're running
>>>> What filters are used for the scan
>>>>
>>>> Thanks
>>>>
>>>> On Apr 10, 2014, at 2:36 AM, gortiz <gortiz@pragsis.com> wrote:
>>>>
>>>>   I got this error when I execute a full scan with filters on a table.
>>>>> Caused by: java.lang.RuntimeException: org.apache.hadoop.hbase.regionserver.LeaseException:
>>>>> org.apache.hadoop.hbase.regionserver.LeaseException: lease
>>>>> '-4165751462641113359' does not exist
>>>>>      at org.apache.hadoop.hbase.regionserver.Leases.removeLease(Leases.java:231)
>>>>>
>>>>>      at org.apache.hadoop.hbase.regionserver.HRegionServer.
>>>>> next(HRegionServer.java:2482)
>>>>>      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>      at sun.reflect.NativeMethodAccessorImpl.invoke(
>>>>> NativeMethodAccessorImpl.java:39)
>>>>>      at sun.reflect.DelegatingMethodAccessorImpl.invoke(
>>>>> DelegatingMethodAccessorImpl.java:25)
>>>>>      at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>      at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(
>>>>> WritableRpcEngine.java:320)
>>>>>      at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(
>>>>> HBaseServer.java:1428)
>>>>>
>>>>> I have read about increasing the lease time and RPC timeout, but it's not
>>>>> working... what else could I try? The table isn't too big. I have been
>>>>> checking the logs from the GC, HMaster and some RegionServers and I didn't see
>>>>> anything weird. I also tried a couple of caching values.
>>>>>
>>>

-- 
*Guillermo Ortiz*
/Big Data Developer/

Telf.: +34 917 680 490
Fax: +34 913 833 301
C/ Manuel Tovar, 49-53 - 28034 Madrid - Spain

_http://www.bidoop.es_

