hbase-dev mailing list archives

From Andrew Purtell <apurt...@yahoo.com>
Subject Re: Issue on data load with 0.20.0-rc2
Date Thu, 20 Aug 2009 10:51:48 GMT
Hi Mathias,

Sounds like FS-level errors are making your region servers sick. Grepping through your datanode
logs should turn up relevant (possibly even revealing) info. Please post what you find.

   - Andy

--- On Thu, 8/20/09, Mathias Herberts <mathias.herberts@gmail.com> wrote:

From: Mathias Herberts <mathias.herberts@gmail.com>
Subject: Issue on data load with 0.20.0-rc2
To: hbase-dev@hadoop.apache.org
Date: Thursday, August 20, 2009, 1:49 AM


I've reinstalled HBase 0.20.0-rc2 yesterday on my 5 node cluster and
reimported some data into it.

My data is imported via an MR job. The Mapper reads SequenceFiles,
generates a new key for each value (unique across values and
deterministic), and outputs the new K,V. The Reducer reads those
records and inserts the V into an HTable, the row key being the K.
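
Purely as an illustration (this is not my actual Mapper code), a key with those
two properties -- deterministic, and unique across distinct values -- could be
derived by hashing each value, e.g. with SHA-1:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch of a deterministic, collision-resistant row key:
// the same value always maps to the same key, and distinct values get
// distinct keys (up to SHA-1 collision odds).
public class KeyGen {
    public static String keyFor(byte[] value) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(value);
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // %02x treats the (possibly negative) byte as unsigned hex
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError(e); // SHA-1 is required in every JRE
        }
    }
}
```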

The import MR job completes, showing 866,587,147 Map Input Records,
Map Output Records and Reduce Input Records. The Reducer outputs the
number of records it inserted into the HTable, and the total across all
10 reducers comes out at the same value of 866,587,147 (which is
indeed how many records I have).

Several Reducer attempts have failed with the following type of error:

org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
contact region server Some server, retryOnlyOne=true, index=0,
islastrow=false, tries=9, numtries=10, i=4856, listsize=13108,
location=address:, regioninfo: REGION => {NAME =>
'foo,,1250702379325', STARTKEY => '', ENDKEY =>
'00AZRPXCWSF8W\xBEO\x7F\xFF\xFF\xFA', ENCODED => 9856138, TABLE =>
{{NAME => 'domirama', FAMILIES => [{NAME => 'copy', VERSIONS => '1',
COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536',
IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}},
region=domirama,,1250702379325 for region domirama,,1250702379325, row
'00AZRPXCLZM7\x5E\xA0\xDF\x7F\xFF\xFF\xEE', but failed after 10

    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1041)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:582)
    at org.apache.hadoop.hbase.client.HTable.put(HTable.java:448)
    at domirama.mapreduce.MR0004$Reducer.reduce(MR0004.java:235)
    at domirama.mapreduce.MR0004$Reducer.reduce(MR0004.java:151)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:543)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:410)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)

I then ran another MR job that counts the rows in the table and that
job only found 866,166,470 records!
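
For reference, the shortfall between the two counts works out as follows
(just arithmetic on the numbers above):

```java
// Difference between the rows the reducers reported inserting and the
// rows the count job found in the table afterwards.
public class Shortfall {
    public static void main(String[] args) {
        long inserted = 866587147L; // reducer-side insert counters
        long counted  = 866166470L; // row-count MR job result
        System.out.println(inserted - counted); // 420677 rows unaccounted for
    }
}
```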

There are a few errors in the regionserver logs (failed compactions or
compacted files that could not be moved), but no errors related to the
regions mentioned in the above errors.

I already encountered a similar issue with rc1, and previously with
trunk, so I guess there is still something in rc2 that makes my use
case fail.

