hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Regions loading too fast
Date Fri, 24 Sep 2010 23:56:38 GMT
Please paste the section from regionserver where you were getting the
EOF to pastebin.  I'd like to see exactly where (but yeah, you get the
idea moving the files aside).  Check the files too.  Are they
zero-length?  If so, please look for them in the master log and paste
me the section where we are splitting.

Thanks Jack,
St.Ack
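
A minimal sketch of the zero-length check asked for above, assuming the output of `hadoop fs -ls` on a recovered.edits directory has been captured as text. The helper name `zero_length_edits` and the sample listing (owner, group, sizes, timestamps) are hypothetical and only illustrate the usual eight-column listing layout; the two paths are taken from this thread.

```python
def zero_length_edits(listing):
    """Return paths of zero-byte files that sit under a recovered.edits directory."""
    empty = []
    for line in listing.splitlines():
        fields = line.split()
        # A file row has 8 columns:
        # permissions, replication, owner, group, size, date, time, path.
        # Header lines ("Found N items") and directory rows are skipped.
        if len(fields) != 8 or not fields[0].startswith("-"):
            continue
        size, path = fields[4], fields[7]
        if "/recovered.edits/" in path and size == "0":
            empty.append(path)
    return empty


# Hypothetical listing: owner, group, sizes and timestamps are invented.
SAMPLE = """Found 2 items
-rw-r--r--   3 hbase supergroup          0 2010-09-24 16:10 /hbase/img96/1062260343/recovered.edits/0000000000617305532
-rw-r--r--   3 hbase supergroup      52310 2010-09-24 16:10 /hbase/img96/1321772129/recovered.edits/0000000000617328530
"""

if __name__ == "__main__":
    for path in zero_length_edits(SAMPLE):
        print(path)
```

Any path it reports is a candidate to look up in the master log around the time the split ran. Paths containing spaces are not handled by this sketch.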


On Fri, Sep 24, 2010 at 4:52 PM, Jack Levin <magnito@gmail.com> wrote:
> It was EOF exception, but now that I deleted edits files:
>
> Moved to trash:
> hdfs://namenode-rd.imageshack.us:9000/hbase/img96/1062260343/recovered.edits/0000000000617305532
> Moved to trash:
> hdfs://namenode-rd.imageshack.us:9000/hbase/img96/1321772129/recovered.edits/0000000000617328530
> Moved to trash:
> hdfs://namenode-rd.imageshack.us:9000/hbase/img96/257974055/recovered.edits/0000000000617238642
> Moved to trash:
> hdfs://namenode-rd.imageshack.us:9000/hbase/img97/117679080/recovered.edits/0000000000617306059
> Moved to trash:
> hdfs://namenode-rd.imageshack.us:9000/hbase/img97/221569766/recovered.edits/0000000000617242019
>
> Like these.  All of the regions have loaded... What could that have
> been?   I assume I lost some writes, but this is not a big deal to
> me... question is how to avoid something like that, is that a bug?
>
> -Jack
>
>
> On Fri, Sep 24, 2010 at 4:44 PM, Stack <stack@duboce.net> wrote:
>> What is the complaint in regionserver log when region load fails?
>> St.Ack
>>
>> On Fri, Sep 24, 2010 at 4:40 PM, Jack Levin <magnito@gmail.com> wrote:
>>> so, datanode log shows no errors whatsoever, however I do see same
>>> blocks fetched repeatedly, and the network speed is quite high, but I
>>> am unable to load _some_ regions, what could it be?
>>>
>>> 2010-09-24 16:38:42,729 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>>> /10.101.6.2:50010, dest: /10.101.6.2:53038, bytes: 914, op: HDFS_READ,
>>> cliID: DFSClient_hb_rs_rdaf2.prod.imageshack.com,60020,1285371202189_1285371202237,
>>> offset: 13803520, srvID: DS-1363732508-10.101.6.2-50010-1284520709569,
>>> blockid: blk_5556468858269577961_1550101, duration: 127413
>>> 2010-09-24 16:38:44,317 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>>> /10.101.6.2:50010, dest: /10.101.6.2:53048, bytes: 110, op: HDFS_READ,
>>> cliID: DFSClient_hb_rs_rdaf2.prod.imageshack.com,60020,1285371202189_1285371202237,
>>> offset: 32723968, srvID: DS-1363732508-10.101.6.2-50010-1284520709569,
>>> blockid: blk_364673737339632029_1347910, duration: 1140653
>>> 2010-09-24 16:38:44,318 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>>> /10.101.6.2:50010, dest: /10.101.6.2:53049, bytes: 38294, op:
>>> HDFS_READ, cliID:
>>> DFSClient_hb_rs_rdaf2.prod.imageshack.com,60020,1285371202189_1285371202237,
>>> offset: 32686080, srvID: DS-1363732508-10.101.6.2-50010-1284520709569,
>>> blockid: blk_364673737339632029_1347910, duration: 691929
>>> 2010-09-24 16:38:44,510 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>>> /10.101.6.2:50010, dest: /10.101.6.2:53054, bytes: 18021300, op:
>>> HDFS_READ, cliID:
>>> DFSClient_hb_rs_rdaf2.prod.imageshack.com,60020,1285371202189_1285371202237,
>>> offset: 0, srvID: DS-1363732508-10.101.6.2-50010-1284520709569,
>>> blockid: blk_-3781179144642915580_1571141, duration: 173548261
>>> 2010-09-24 16:38:44,525 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>>> /10.101.6.2:50010, dest: /10.101.6.2:53055, bytes: 506, op: HDFS_READ,
>>> cliID: DFSClient_hb_rs_rdaf2.prod.imageshack.com,60020,1285371202189_1285371202237,
>>> offset: 48700928, srvID: DS-1363732508-10.101.6.2-50010-1284520709569,
>>> blockid: blk_-176750251227749356_1535293, duration: 77045
>>> 2010-09-24 16:38:44,526 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>>> /10.101.6.2:50010, dest: /10.101.6.2:53056, bytes: 6182, op:
>>> HDFS_READ, cliID:
>>> DFSClient_hb_rs_rdaf2.prod.imageshack.com,60020,1285371202189_1285371202237,
>>> offset: 48695296, srvID: DS-1363732508-10.101.6.2-50010-1284520709569,
>>> blockid: blk_-176750251227749356_1535293, duration: 128270
>>>
>>>
>>>
>>>
>>> On Fri, Sep 24, 2010 at 4:32 PM, Stack <stack@duboce.net> wrote:
>>>> (Good one Ryan)
>>>>
>>>> Master is doing the assigning.  It needs to be restarted to see the
>>>> config change.
>>>>
>>>> St.Ack
>>>>
>>>> On Fri, Sep 24, 2010 at 4:28 PM, Jack Levin <magnito@gmail.com> wrote:
>>>>> Only regionserver, do I need to restart both?
>>>>>
>>>>> -jack
>>>>>
>>>>> On Fri, Sep 24, 2010 at 4:22 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:
>>>>>> Did you restart the master and the regionserver? Or just one or the other?
>>>>>>
>>>>>> -ryan
>>>>>>
>>>>>> On Fri, Sep 24, 2010 at 4:21 PM, Jack Levin <magnito@gmail.com> wrote:
>>>>>>> Also, even with '1' value, I see:
>>>>>>>
>>>>>>> 2010-09-24 16:20:29,983 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN:
>>>>>>> img834,1000351n.jpg,1285251664421.d09510a16c0cfd0d8a251a36229125e0.
>>>>>>> 2010-09-24 16:20:29,984 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN:
>>>>>>> img651,pict1408.jpg,1285018965749.110871465
>>>>>>> 2010-09-24 16:20:29,984 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN:
>>>>>>> img806,sam0084a.jpg,1285324613056.82a1e8ba8d2a37a591a847fb36803c45.
>>>>>>> 2010-09-24 16:20:29,985 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN:
>>>>>>> img535,screenshot1bt.png,1285323376435.fae5f3ab474196c99f10b8a936fb9ead.
>>>>>>> 2010-09-24 16:20:29,985 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN:
>>>>>>> img838,123468.jpg,1285223690165.a2903008621d1a6b6ca02441bf3b68ea.
>>>>>>> 2010-09-24 16:20:29,985 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN:
>>>>>>> img839,yug.jpg,1285230318537.c09323dbaf54130671df2a14d671fe25.
>>>>>>> 2010-09-24 16:20:29,985 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN:
>>>>>>> img821,vlcsnap78737.png,1285283076812.ea4973ce6e43d7f974613c5989647278.
>>>>>>> 2010-09-24 16:20:29,985 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN:
>>>>>>> img805,njt30scbkdmb.gif,1285322429401.f9aacdafd8064bfbcc8cd4f6930febbe.
>>>>>>> 2010-09-24 16:20:29,985 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN:
>>>>>>> img94,img1711m.jpg,1285016850260.1424182007
>>>>>>> 2010-09-24 16:20:29,986 DEBUG
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegion: Creating region
>>>>>>> img840,kitbarca2.png,1285189312696.1ce170ec09384fca51297a5fe7aeb4af.
>>>>>>>
>>>>>>> Which is pretty close to concurrent.
>>>>>>>
>>>>>>> -Jack
>>>>>>>
>>>>>>> On Fri, Sep 24, 2010 at 4:16 PM, Jack Levin <magnito@gmail.com> wrote:
>>>>>>>> Still having a problem:
>>>>>>>>
>>>>>>>> 2010-09-24 16:15:02,572 ERROR
>>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Error opening
>>>>>>>> img695,p1908101232.jpg,1285288492084.d451f05024b42f71a115951c62cdcccf.
>>>>>>>> java.io.EOFException
>>>>>>>>        at java.io.DataInputStream.readFully(DataInputStream.java:180)
>>>>>>>>        at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
>>>>>>>>        at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
>>>>>>>>        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1937)
>>>>>>>>        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1837)
>>>>>>>>
>>>>>>>>
>>>>>>>> I changed the value to '1', and restarted the regionserver... Note
>>>>>>>> that my hdfs is not having a problem.
>>>>>>>>
>>>>>>>> -Jack
>>>>>>>>
>>>>>>>> On Fri, Sep 24, 2010 at 4:01 PM, Stack <stack@duboce.net> wrote:
>>>>>>>>> Try
>>>>>>>>>
>>>>>>>>>  <property>
>>>>>>>>>    <name>hbase.regions.percheckin</name>
>>>>>>>>>    <value>10</value>
>>>>>>>>>    <description>Maximum number of regions that can be assigned in a single go
>>>>>>>>>    to a region server.
>>>>>>>>>    </description>
>>>>>>>>>  </property>
>>>>>>>>>
>>>>>>>>> What do you have now?  Whatever it is, go down from there.
>>>>>>>>>
>>>>>>>>> St.Ack
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Sep 24, 2010 at 3:07 PM, Jack Levin <magnito@gmail.com> wrote:
>>>>>>>>>> My regions are 1 GB in size, and when I cold start the cluster I
>>>>>>>>>> oversaturate my network links (1000 Mbps) and get client dfs timeouts.
>>>>>>>>>> Any way to slow them down?
>>>>>>>>>>
>>>>>>>>>> -Jack
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
