hbase-dev mailing list archives

From "Marc Harris (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-428) Under continuous upload of rows, WrongRegionExceptions are thrown that reach the client even after retries
Date Sat, 16 Feb 2008 14:35:07 GMT

    [ https://issues.apache.org/jira/browse/HBASE-428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12569569#action_12569569
] 

Marc Harris commented on HBASE-428:
-----------------------------------

The patch appears to fix this error, but a new one (out of heap space) now occurs later.
Possibly this bug should be considered fixed and a new one opened.

Current results:
I no longer get WrongRegionExceptions.
After approximately 700,000 rows are uploaded, the region server throws an OutOfMemoryError, followed
by many "Server not running" exceptions (exception log below).
I am able to restart the hbase region and master servers (and the client app), and store another
800,000 rows before the same OutOfMemoryError occurs.
After that, I can again restart the hbase region and master servers (and the client app), but continuing
the upload quickly causes more OutOfMemoryErrors.

Full logs will be sent to stack by e-mail.

2008-02-16 02:24:38,884 INFO org.apache.hadoop.hbase.HLog: new log writer created at hdfs://server14:54310/hbase/log_66.135.42.137_1203123804816_60020/hlog.dat.322
2008-02-16 02:25:45,751 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for
region pagefetch,http://www.marketwatch.com/hdml wap2 20071222205256,1203126936284. Size 62.6m
2008-02-16 02:25:57,378 FATAL org.apache.hadoop.hbase.HRegionServer: Set stop flag in regionserver/0:0:0:0:0:0:0:0:60020.cacheFlusher
java.lang.OutOfMemoryError: Java heap space
	at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
	at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$Packet.<init>(DFSClient.java:1518)
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:2125)
	at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:141)
	at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:100)
	at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:41)
	at java.io.DataOutputStream.write(DataOutputStream.java:90)
	at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:977)
	at org.apache.hadoop.io.MapFile$Writer.append(MapFile.java:188)
	at org.apache.hadoop.hbase.HStoreFile$BloomFilterMapFile$Writer.append(HStoreFile.java:721)
	at org.apache.hadoop.hbase.HStore.internalFlushCache(HStore.java:1113)
	at org.apache.hadoop.hbase.HStore.flushCache(HStore.java:1081)
	at org.apache.hadoop.hbase.HRegion.internalFlushcache(HRegion.java:954)
	at org.apache.hadoop.hbase.HRegion.flushcache(HRegion.java:852)
	at org.apache.hadoop.hbase.HRegionServer$Flusher.run(HRegionServer.java:417)
2008-02-16 02:25:57,405 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 60020,
call batchUpdate(pagefetch,http://mobility.mobi/showthread.php?goto=newpost&t=2677 wap2
20071223005632,1203126357490, 9223372036854775807, org.apache.hadoop.hbase.io.BatchUpdate@9bad5a)
from 66.135.42.137:56275: error: java.io.IOException: Server not running
java.io.IOException: Server not running
	at org.apache.hadoop.hbase.HRegionServer.checkOpen(HRegionServer.java:1626)
	at org.apache.hadoop.hbase.HRegionServer.batchUpdate(HRegionServer.java:1429)
	at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:585)
	at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:413)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:910)
2008-02-16 02:25:57,406 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 60020,
call getClosestRowBefore(.META.,,1, pagefetch,http://pda.physorg.com/lofi-news-seafloor-fault-tsunami_114370203.html
wap2 20080102111657,999999999999999, 9223372036854775807) from 66.135.42.137:56275: error:
java.io.IOException: Server not running



> Under continuous upload of rows, WrongRegionExceptions are thrown that reach the client
even after retries
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-428
>                 URL: https://issues.apache.org/jira/browse/HBASE-428
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.1.0, 0.2.0
>         Environment: Linux 2.6.9-67.0.1.ELsmp #1 SMP Wed Dec 19 16:01:12 EST 2007 i686
athlon i386 GNU/Linux
>            Reporter: Marc Harris
>            Assignee: stack
>            Priority: Blocker
>         Attachments: 428.patch, filesbysize.csv, lsr, selectfrommeta.txt
>
>
> I have installed 0.16.0 rc 1, which I believe contains a fix for the similar issue HBASE-138,
but I still see the same problem.
> - I am using a single node.
> - The client application runs in a single thread, loading data into a single table.
> - I get good throughput of about 200 rows/sec to start with, with occasional significant
drops due to NotServingRegionExceptions that are recoverable on client retry (internal to
hbase).
> - After 54 minutes and about 500,000 rows, I start to see WrongRegionExceptions in the
client application, i.e. real failures. (Note that this compares to 0.15.3, which would begin
to throw NotServingRegionExceptions after a few tens of thousands of rows.)
> My data consists of a single table with 5 column families. The data written is as follows:
> key: a URL
> family 1: a small string, often empty, 2 longs, 1 int
> family 2: a byte array averaging between 1k and 10k, a small string
> family 3: several columns with different names per row, values of small strings
> family 4: most rows have zero columns, some rows have 1 or more columns with a URL value
> The URLs are typically "long-ish" URLs as seen when crawling a site, not short home page
URLs.
>  
> I am assuming the data is stored in files of the form <hbaseroot>//<tablename>/<9digitnum>/data/mapfiles/<19digitnum>/data.
I have attached a csv file showing the distribution of the sizes of these files. The average size is
19 MB, but the sizes are not evenly distributed at all.
> Here are two sample exceptions thrown, copied from the region server log:
> 2008-02-08 02:08:22,495 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 60020,
call batchUpdate(pagefetch,http://galsn1.mobilook.mobiwap.com/bm/listproducts;jsessionid=D2ED1EB898163CDB27135DC2CF6958B3.197B?rsi=78011
wap2 20080102052924,1202401088077, 9223372036854775807, org.apache.hadoop.hbase.io.BatchUpdate@feb215)
from 66.135.42.137:38484: error: org.apache.hadoop.hbase.WrongRegionException: Requested row
out of range for HRegion pagefetch,http://galsn1.mobilook.mobiwap.com/bm/listproducts;jsessionid=D2ED1EB898163CDB27135DC2CF6958B3.197B?rsi=78011
wap2 20080102052924,1202401088077, startKey='http://galsn1.mobilook.mobiwap.com/bm/listproducts;jsessionid=D2ED1EB898163CDB27135DC2CF6958B3.197B?rsi=78011
wap2 20080102052924', getEndKey()='http://galsn1.mobilook.mobiwap.com/bm/listproducts;jsessionid=D2ED1EB898163CDB27135DC2CF6958B3.197B?rsi=78011
wap2 20080102052924', row='http://go2purdue.com/Redeemer_University.cfm?pt=2&sp=2&vid=1199243289_3X02X1468757255&rpt=2&kt=4&kp=1
wap2 20080102081237'
> org.apache.hadoop.hbase.WrongRegionException: Requested row out of range for HRegion
pagefetch,http://galsn1.mobilook.mobiwap.com/bm/listproducts;jsessionid=D2ED1EB898163CDB27135DC2CF6958B3.197B?rsi=78011
wap2 20080102052924,1202401088077, startKey='http://galsn1.mobilook.mobiwap.com/bm/listproducts;jsessionid=D2ED1EB898163CDB27135DC2CF6958B3.197B?rsi=78011
wap2 20080102052924', getEndKey()='http://galsn1.mobilook.mobiwap.com/bm/listproducts;jsessionid=D2ED1EB898163CDB27135DC2CF6958B3.197B?rsi=78011
wap2 20080102052924', row='http://go2purdue.com/Redeemer_University.cfm?pt=2&sp=2&vid=1199243289_3X02X1468757255&rpt=2&kt=4&kp=1
wap2 20080102081237'
>         at org.apache.hadoop.hbase.HRegion.checkRow(HRegion.java:1486)
>         at org.apache.hadoop.hbase.HRegion.obtainRowLock(HRegion.java:1531)
>         at org.apache.hadoop.hbase.HRegion.batchUpdate(HRegion.java:1226)
>         at org.apache.hadoop.hbase.HRegionServer.batchUpdate(HRegionServer.java:1433)
>         at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:585)
>         at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:413)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:910)
> 2008-02-08 02:08:22,696 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 60020,
call batchUpdate(pagefetch,http://galsn1.mobilook.mobiwap.com/bm/listproducts;jsessionid=D2ED1EB898163CDB27135DC2CF6958B3.197B?rsi=78011
wap2 20080102052924,1202401088077, 9223372036854775807, org.apache.hadoop.hbase.io.BatchUpdate@15d9be1)
from 66.135.42.137:38484: error: org.apache.hadoop.hbase.WrongRegionException: Requested row
out of range for HRegion pagefetch,http://galsn1.mobilook.mobiwap.com/bm/listproducts;jsessionid=D2ED1EB898163CDB27135DC2CF6958B3.197B?rsi=78011
wap2 20080102052924,1202401088077, startKey='http://galsn1.mobilook.mobiwap.com/bm/listproducts;jsessionid=D2ED1EB898163CDB27135DC2CF6958B3.197B?rsi=78011
wap2 20080102052924', getEndKey()='http://galsn1.mobilook.mobiwap.com/bm/listproducts;jsessionid=D2ED1EB898163CDB27135DC2CF6958B3.197B?rsi=78011
wap2 20080102052924', row='http://go2umass.com/Travel.cfm?pt=2&sp=2&vid=1199230721_3X04X1485302803&rpt=2&kt=5&kp=8
wap2 20080102081239'
> org.apache.hadoop.hbase.WrongRegionException: Requested row out of range for HRegion
pagefetch,http://galsn1.mobilook.mobiwap.com/bm/listproducts;jsessionid=D2ED1EB898163CDB27135DC2CF6958B3.197B?rsi=78011
wap2 20080102052924,1202401088077, startKey='http://galsn1.mobilook.mobiwap.com/bm/listproducts;jsessionid=D2ED1EB898163CDB27135DC2CF6958B3.197B?rsi=78011
wap2 20080102052924', getEndKey()='http://galsn1.mobilook.mobiwap.com/bm/listproducts;jsessionid=D2ED1EB898163CDB27135DC2CF6958B3.197B?rsi=78011
wap2 20080102052924', row='http://go2umass.com/Travel.cfm?pt=2&sp=2&vid=1199230721_3X04X1485302803&rpt=2&kt=5&kp=8
wap2 20080102081239'
>         at org.apache.hadoop.hbase.HRegion.checkRow(HRegion.java:1486)
>         at org.apache.hadoop.hbase.HRegion.obtainRowLock(HRegion.java:1531)
>         at org.apache.hadoop.hbase.HRegion.batchUpdate(HRegion.java:1226)
>         at org.apache.hadoop.hbase.HRegionServer.batchUpdate(HRegionServer.java:1433)
>         at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:585)
>         at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:413)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:910)
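
Both sample exceptions above report a region whose startKey and getEndKey() are
identical. As a minimal sketch, assuming the range check behind HRegion.checkRow is a
half-open test (startKey <= row < endKey, with an empty key meaning unbounded) over
byte-wise ordered keys, such a region would reject every row, including its own start
key. The class name, helper names and short keys below are hypothetical and are not
taken from the HBase source:

```java
import java.nio.charset.StandardCharsets;

public class RowRangeSketch {

    // Assumed semantics: a row belongs to a region when startKey <= row < endKey.
    // An empty key is treated as "unbounded" on that side.
    static boolean rowIsInRange(byte[] startKey, byte[] endKey, byte[] row) {
        boolean atOrAfterStart = startKey.length == 0 || compare(startKey, row) <= 0;
        boolean beforeEnd = endKey.length == 0 || compare(row, endKey) < 0;
        return atOrAfterStart && beforeEnd;
    }

    // Unsigned, byte-wise lexicographic comparison.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical short keys standing in for the long URL keys in the log.
        byte[] boundary = key("http://galsn1.example/");
        byte[] laterRow = key("http://go2purdue.example/");
        byte[] unbounded = new byte[0];

        // Region with startKey == endKey: the interval is empty.
        System.out.println(rowIsInRange(boundary, boundary, laterRow));
        System.out.println(rowIsInRange(boundary, boundary, boundary));

        // Same start key with an unbounded end key accepts the later row.
        System.out.println(rowIsInRange(boundary, unbounded, laterRow));
    }
}
```

Under these assumptions the first two checks fail and the third succeeds, which would be
consistent with the region metadata (rather than the client's row) being the odd part of
the logged exceptions.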

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

