hbase-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: Hbase Region Server crash when uploading large data
Date Thu, 02 Jul 2009 15:49:24 GMT
Chu,

HBase 0.18.1 is an old version. 0.19.3 is the latest release. 0.20.0 is
about to be released. We are in the best position to help you if you can
upgrade. 

Since it sounds like you have no legacy data and are starting with a
blank HBase install and importing data, you should not require a
migration step. Is this the case? If so, we have a version of 0.20.0
(prerelease) that will work with Hadoop 0.18.3. If you are interested,
contact me privately and I'll help you to set it up and try it.

    - Andy





________________________________
From: stchu <stchu.cloud@gmail.com>
To: hbase-user@hadoop.apache.org
Sent: Thursday, July 2, 2009 2:59:09 AM
Subject: Hbase Region Server crash when uploading large data

Hi,

I am trying to import a large HDFS file (about 4 GB) into HBase.
The map class works with the TextInputFormat provided by Hadoop, and
the reduce class is implemented with TableReduce.
The map phase completes without any problem, but the region server crashes
during the reduce stage.
The log shows:

ent: Retrying connect to server: /140.96.89.57:61020. Already tried 8 time(s).
2009-07-02 17:42:34,588 INFO org.apache.hadoop.ipc.Client: Retrying
connect to server: /140.96.89.57:61020. Already tried 9 time(s).
2009-07-02 17:42:37,600 INFO org.apache.hadoop.ipc.Client: Retrying
connect to server: /140.96.89.57:61020. Already tried 0 time(s).
2009-07-02 17:42:38,600 INFO org.apache.hadoop.ipc.Client: Retrying
connect to server: /140.96.89.57:61020. Already tried 1 time(s).
2009-07-02 17:42:39,600 INFO org.apache.hadoop.ipc.Client: Retrying
connect to server: /140.96.89.57:61020. Already tried 2 time(s).
2009-07-02 17:42:40,600 INFO org.apache.hadoop.ipc.Client: Retrying
connect to server: /140.96.89.57:61020. Already tried 3 time(s).
2009-07-02 17:42:41,600 INFO org.apache.hadoop.ipc.Client: Retrying
connect to server: /140.96.89.57:61020. Already tried 4 time(s).
2009-07-02 17:42:42,601 INFO org.apache.hadoop.ipc.Client: Retrying
connect to server: /140.96.89.57:61020. Already tried 5 time(s).
2009-07-02 17:42:43,601 INFO org.apache.hadoop.ipc.Client: Retrying
connect to server: /140.96.89.57:61020. Already tried 6 time(s).
2009-07-02 17:42:44,601 INFO org.apache.hadoop.ipc.Client: Retrying
connect to server: /140.96.89.57:61020. Already tried 7 time(s).
2009-07-02 17:42:45,601 INFO org.apache.hadoop.ipc.Client: Retrying
connect to server: /140.96.89.57:61020. Already tried 8 time(s).
2009-07-02 17:42:46,601 INFO org.apache.hadoop.ipc.Client: Retrying
connect to server: /140.96.89.57:61020. Already tried 9 time(s).
2009-07-02 17:42:51,613 INFO org.apache.hadoop.ipc.Client: Retrying
connect to server: /140.96.89.57:61020. Already tried 0 time(s).
2009-07-02 17:42:52,613 INFO org.apache.hadoop.ipc.Client: Retrying
connect to server: /140.96.89.57:61020. Already tried 1 time(s).
2009-07-02 17:42:53,613 INFO org.apache.hadoop.ipc.Client: Retrying
connect to server: /140.96.89.57:61020. Already tried 2 time(s).
....

and one of the region servers logs:
2009-07-02 10:38:07,807 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 6 on 61020, call batchUpdate([B@1da5af5, row =>
-27.456528333_153.006395000, {column => UserData:trail_id, value =>
'...'}, -1) from 140.96.89.205:49560: error:
org.apache.hadoop.hbase.NotServingRegionException: Region
TestTable,,1246500940788 closed
org.apache.hadoop.hbase.NotServingRegionException: Region
TestTable,,1246500940788 closed
        at org.apache.hadoop.hbase.regionserver.HRegion.obtainRowLock(HRegion.java:1810)
        at org.apache.hadoop.hbase.regionserver.HRegion.getLock(HRegion.java:1875)
        at org.apache.hadoop.hbase.regionserver.HRegion.batchUpdate(HRegion.java:1406)
        at org.apache.hadoop.hbase.regionserver.HRegion.batchUpdate(HRegion.java:1380)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1114)
        at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:554)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
2009-07-02 10:38:07,808 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 2 on 61020, call batchUpdate([B@bf1d4e, row =>
-27.449870000000_153.011180000000, {column => UserData:trail_id, value
=> '...'}, -1) from 140.96.89.193:52723: error:
org.apache.hadoop.hbase.NotServingRegionException: Region
TestTable,,1246500940788 closed
org.apache.hadoop.hbase.NotServingRegionException: Region
TestTable,,1246500940788 closed
        at org.apache.hadoop.hbase.regionserver.HRegion.obtainRowLock(HRegion.java:1810)
        at org.apache.hadoop.hbase.regionserver.HRegion.getLock(HRegion.java:1875)
        at org.apache.hadoop.hbase.regionserver.HRegion.batchUpdate(HRegion.java:1406)
        at org.apache.hadoop.hbase.regionserver.HRegion.batchUpdate(HRegion.java:1380)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1114)
        at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:554)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
=======================================================================================================

The job eventually fails. Can anyone give me some guidance? I am running
this job on a 4-node cluster (one master and 3 slaves) with Hadoop 0.18.3
and HBase 0.18.1.
Thanks a lot!!

chu
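
In the client log quoted above, the "Already tried N time(s)" counter runs from 0 to 9 and then restarts at 0, so the same connection attempt is being repeated in rounds. When triaging a longer log of this kind, it can help to tally attempts and rounds per server. The following is a minimal sketch, not part of the original thread: the function name `summarize_retries` is hypothetical, and it assumes the log lines have the exact wording shown above and may be wrapped across lines as they are in this message.

```python
import re
from collections import Counter

# Matches the Hadoop IPC client retry messages as they appear in the quoted log.
RETRY_RE = re.compile(
    r"Retrying connect to server: (?P<server>\S+)\. "
    r"Already tried (?P<attempt>\d+) time\(s\)\."
)


def summarize_retries(log_text):
    """Return (attempts, rounds) dicts keyed by server address.

    attempts: total number of retry messages seen per server.
    rounds:   number of times the counter read 0, i.e. how many
              retry rounds were started, per server.
    """
    # The email wraps each log entry over two lines, so collapse all
    # whitespace before matching.
    flat = " ".join(log_text.split())
    attempts = Counter()
    rounds = Counter()
    for match in RETRY_RE.finditer(flat):
        server = match.group("server")
        attempts[server] += 1
        if int(match.group("attempt")) == 0:
            rounds[server] += 1
    return dict(attempts), dict(rounds)


if __name__ == "__main__":
    import sys

    # Usage: python summarize_retries.py < client.log
    attempts, rounds = summarize_retries(sys.stdin.read())
    for server in sorted(attempts):
        print(server, "attempts:", attempts[server], "rounds:", rounds.get(server, 0))
```

A server that shows many rounds with no successful connection in between, such as 140.96.89.57:61020 here, is the one whose region server log is worth reading first.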



      