hbase-user mailing list archives

From Sebastian Bauer <ad...@ugame.net.pl>
Subject Re: Strange problem
Date Tue, 26 Oct 2010 08:52:33 GMT
On 26.10.2010 06:50, Stack wrote:
> On Mon, Oct 25, 2010 at 1:06 PM, Sebastian Bauer <admin@ugame.net.pl> wrote:
>> This problem was related to this:
>>
>> 2010-10-25 08:13:00,933 DEBUG
>> org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing.
>> servers=2 regions=5073 average=2536.5 mostloaded=2537 leastloaded=2537
>> 2010-10-25 08:13:01,690 WARN
>> org.apache.hadoop.hbase.master.CatalogJanitor: REGIONINFO_QUALIFIER is
>> empty in
>> keyvalues={ZNANYLEKARZ_CTU,67-w_2010_36_6C8010B775B90B091CA2B04D2CA7CD49,1287059084055.83fda231f456837b4c9316d433bb89bc./info:server/1287909181224/Put/vlen=24,
>> ZNANYLEKARZ_CTU,67-w_2010_36_6C8010B775B90B091CA2B04D2CA7CD49,1287059084055.83fda231f456837b4c9316d433bb89bc./info:serverstartcode/1287909181224/Put/vlen=8}
>>
> Seems like the same null regioninfo issue, Sebastian.
>
> See below.
>
>
Inside the Hadoop directory we have this:

hbase@db2a:hadoop$ ./bin/hadoop fs -ls /hbase_bck/ZNANYLEKARZ_CTU/83fda231f456837b4c9316d433bb89bc
Found 3 items
-rw-r--r-- 2 hbase supergroup 976 2010-10-23 20:51 /hbase_bck/ZNANYLEKARZ_CTU/83fda231f456837b4c9316d433bb89bc/.regioninfo
drwxr-xr-x - hbase supergroup 0 2010-10-23 20:51 /hbase_bck/ZNANYLEKARZ_CTU/83fda231f456837b4c9316d433bb89bc/.tmp
drwxr-xr-x - hbase supergroup 0 2010-10-23 20:51 /hbase_bck/ZNANYLEKARZ_CTU/83fda231f456837b4c9316d433bb89bc/c


hbase@db2a:hadoop$ ./bin/hadoop fs -ls /hbase_bck/ZNANYLEKARZ_CTU/83fda231f456837b4c9316d433bb89bc/c
Found 3 items
-rw-r--r-- 2 hbase supergroup 8366531 2010-10-23 20:51 /hbase_bck/ZNANYLEKARZ_CTU/83fda231f456837b4c9316d433bb89bc/c/2051016052314330858
-rw-r--r-- 2 hbase supergroup 1411 2010-10-23 20:51 /hbase_bck/ZNANYLEKARZ_CTU/83fda231f456837b4c9316d433bb89bc/c/2371747914411710526
-rw-r--r-- 2 hbase supergroup 17896 2010-10-23 20:51 /hbase_bck/ZNANYLEKARZ_CTU/83fda231f456837b4c9316d433bb89bc/c/8812317362551976051


hbase@db2a:hadoop$ ./bin/hadoop fs -ls /hbase_bck/ZNANYLEKARZ_CTU/83fda231f456837b4c9316d433bb89bc/.tmp
hbase@db2a:hadoop$

hbase@db2a:hadoop$ ./bin/hadoop fs -cat /hbase_bck/ZNANYLEKARZ_CTU/83fda231f456837b4c9316d433bb89bc/.regioninfo
%73-g_20A8406500E1593C617327A77113AFB9+��mZNANYLEKARZ_CTU,67-w_2010_36_6C8010B775B90B091CA2B04D2CA7CD49,1287059084055.83fda231f456837b4c9316d433bb89bc.-67-w_2010_36_6C8010B775B90B091CA2B04D2CA7CD49ZNANYLEKARZ_CTUIS_ROOTfalseIS_METAfalsBLOOMFILTERROWREPLICATION_SCOPE0COMPRESSIONLZVERSIONS1TTL-1
BLOCKSIZE65536 IN_MEMORYfalse
BLOCKCACHEtrue�3��

REGION => {NAME =>
'ZNANYLEKARZ_CTU,67-w_2010_36_6C8010B775B90B091CA2B04D2CA7CD49,1287059084055.83fda231f456837b4c9316d433bb89bc.',
STARTKEY => '67-w_2010_36_6C8010B775B90B091CA2B04D2CA7CD49', ENDKEY =>
'73-g_20A8406500E1593C617327A77113AFB9', ENCODED =>
83fda231f456837b4c9316d433bb89bc, TABLE => {{NAME => 'ZNANYLEKARZ_CTU',
FAMILIES => [{NAME => 'c', BLOOMFILTER => 'ROW', REPLICATION_SCOPE =>
'0', COMPRESSION => 'LZO', VERSIONS => '1', TTL => '-1', BLOCKSIZE =>
'65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}hbase@db2a:hadoop$

So it's not empty (I think).
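
One possible reading, offered with some caution: the WARN may be about the
info:regioninfo cell of the row in .META. rather than about the .regioninfo
file on HDFS. The row dumped in the log lists only info:server and
info:serverstartcode. A minimal Python sketch that pulls the qualifiers out of
the logged text; the helper names are hypothetical and not part of any HBase
API, and the cell layout (row/family:qualifier/timestamp/type/vlen=N) is
assumed from the log line quoted above:

```python
import re

# Row name and cells copied from the CatalogJanitor WARN line quoted above.
ROW = ("ZNANYLEKARZ_CTU,67-w_2010_36_6C8010B775B90B091CA2B04D2CA7CD49,"
       "1287059084055.83fda231f456837b4c9316d433bb89bc.")
LOG_KEYVALUES = (
    "keyvalues={" + ROW + "/info:server/1287909181224/Put/vlen=24, "
    + ROW + "/info:serverstartcode/1287909181224/Put/vlen=8}"
)


def qualifiers_in(keyvalues):
    """Return the family:qualifier names found in a 'keyvalues={...}' dump."""
    # Assumed cell layout: row/family:qualifier/timestamp/type/vlen=N
    return set(re.findall(r"/(\w+:\w+)/\d+/\w+/vlen=\d+", keyvalues))


def missing_regioninfo(keyvalues):
    """True when the dumped row carries no info:regioninfo cell."""
    return "info:regioninfo" not in qualifiers_in(keyvalues)


if __name__ == "__main__":
    print(sorted(qualifiers_in(LOG_KEYVALUES)))
    print("regioninfo missing:", missing_regioninfo(LOG_KEYVALUES))
```

If that reading is right, the file on HDFS can be intact while the catalog row
is still incomplete.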


>> 2010-10-25 08:13:01,690 ERROR
>> org.apache.hadoop.hbase.master.CatalogJanitor: Caught exception
>> java.lang.NullPointerException
>> 2010-10-25 08:13:24,385 INFO
>> org.apache.hadoop.hbase.master.ServerManager: regionservers=2,
>> averageload=2538
>>
> You have 2500 regions on two servers, Sebastian?
Yes, but they are really small (8 MB) and we are using about 100 regions; the
other regions hold past data and are only used when we want some
statistics. Load on these machines is about 5-10% and we have about 10k QPS.

>
>> because after dropping this table the problem went away. The problem with
>> CatalogJanitor was causing problems with (re)starting HBase:
>>
>> 2010-10-23 20:16:17,890 DEBUG
>> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
>> Cached location for .META.,,1.1028785192 is db2a.goldenline.pl:60020
>> 2010-10-23 20:16:18,432 FATAL org.apache.hadoop.hbase.master.HMaster:
>> Unhandled exception. Starting shutdown.
>> java.lang.NullPointerException
>>        at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:75)
>>        at org.apache.hadoop.hbase.util.Writables.getHRegionInfo(Writables.java:119)
>>        at org.apache.hadoop.hbase.client.MetaScanner$1.processRow(MetaScanner.java:188)
>>        at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:157)
>>        at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:69)
>>        at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:54)
>>        at org.apache.hadoop.hbase.client.MetaScanner.listAllRegions(MetaScanner.java:195)
>>        at org.apache.hadoop.hbase.master.AssignmentManager.assignAllUserRegions(AssignmentManager.java:1048)
>>        at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:379)
>>        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:265)
>> 2010-10-23 20:16:18,433 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
>> 2010-10-23 20:16:18,433 DEBUG org.apache.hadoop.hbase.master.HMaster: Stopping service threads
>>
>>
>> I still have access to data with very similar problem.
>>
> Sorry, I don't follow?  Its not starting for you still?
>
> I made https://issues.apache.org/jira/browse/HBASE-3151 to cover the
> NPE.  Can you fill into the issue the version of hbase you are running
> (Is it TRUNK)?
It starts after dropping .META. and the table causing the problem and then
running add_table.rb, but I have a backup with this data.
It's almost TRUNK.


> Thanks,
> St.Ack
>
Thanks
S.
>>
>> On 24.10.2010 19:14, Stack wrote:
>>> Please thread dump the regionserver using 100%.  Do it a few times and
>>> pastebin the result so we can take a look.
>>> St.Ack
>>>
>>> On Sun, Oct 24, 2010 at 1:52 AM, Sebastian Bauer <admin@ugame.net.pl> wrote:
>>>> Today I tried to do some denormalization in my HBase, so I ran a scanner on
>>>> one table with day-user counters and did ICVs into a table with
>>>> week-user, month-user and global-user counters. After about 2 hours the
>>>> program (written in Python) raised an exception in the scanner because the
>>>> scanner expired. The main problem was that HBase was running very slowly
>>>> while the program was doing ICVs: one region server was using 100%, but
>>>> nothing was being written to the logs, no splits, compactions, etc.
>>>>
>>>> Do you have any ideas what can cause this?
>>>>
>>>> Thank you!
>>>> S.
>>>>
>>
>> --
>>
>> Regards
>> Sebastian Bauer
>> -----------------------------------------------------
>> http://tikecik.pl
>>
>>


-- 

Regards
Sebastian Bauer
-----------------------------------------------------
http://tikecik.pl

