hbase-user mailing list archives

From Guillaume Viland <guillaume.viland@orange-ftgroup.com>
Subject Issues/Problems concerning hbase data insertion
Date Wed, 16 Sep 2009 15:35:26 GMT
Hi all,
We are in the process of evaluating HBase for managing "bigtable"-scale data (to give an
idea, ~1G entries of 500 bytes each). We are now facing some issues and I would like to have
comments on what I have noticed.
Our configuration is Hadoop 0.19.1 and HBase 0.19.3 (both hadoop-default/site.xml and
hbase-default/site.xml are attached), on 15 nodes (16 or 8 GB RAM and 1.3 TB disk each,
Linux kernel 2.6.24-standard, Java version "1.6.0_12").
For now the test case runs against one IndexedTable (without, for the moment, using the
index column) with 25M entries/rows:
the map phase formats the data and 15 reduces BatchUpdate the textual data (URLs and simple
text fields < 500 bytes).
All processes (Hadoop/HBase) are started with -Xmx1000m, and the IndexedTable is configured
with AutoCommit set to false.
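
To make the write path concrete, the reduce side looks roughly like this (a simplified
sketch against the 0.19 client API as we understand it; the column names are placeholders,
the MapReduce plumbing is omitted, and by "AutoCommit false" we mean setAutoFlush(false)
on the table -- please correct the method names if we got them wrong):

  // Assumed imports: org.apache.hadoop.hbase.HBaseConfiguration,
  // org.apache.hadoop.hbase.client.tableindexed.IndexedTable,
  // org.apache.hadoop.hbase.io.BatchUpdate, org.apache.hadoop.hbase.util.Bytes
  HBaseConfiguration conf = new HBaseConfiguration();
  IndexedTable table = new IndexedTable(conf, Bytes.toBytes("urlsdata-validation"));
  table.setAutoFlush(false);                // buffer updates client side

  // per reduce record:
  BatchUpdate bu = new BatchUpdate(rowKey); // rowKey built by the map phase
  bu.put("content:url", urlBytes);          // placeholder column names,
  bu.put("content:text", textBytes);        // each value < 500 bytes
  table.commit(bu);                         // queued in the client write buffer

  // at the end of the reduce:
  table.flushCommits();                     // pushes everything buffered so far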

ISSUE 1: We need one indexed column to get "fast" UI queries (for instance, as an answer
to a Web form we could expect to wait at most 30 seconds). The only documentation I found
concerning indexed columns comes from http://rajeev1982.blogspot.com/2009/06/secondary-indexes-in-hbase.html
Instead of putting the indextable properties in hbase-site.xml (which I have tested, but
which gives very poor performance and also loses entries...), I pass the properties to the
job through -conf indextable_properties.xml (file is attached). Am I right that putting the
indextable properties into hbase-site.xml applies them to the whole HBase cluster, which
would explain the significant drop in overall performance?
The best performance was reached by passing them through the -conf option handled by
ToolRunner, wired roughly as in the sketch below.
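
For reference, the job entry point is set up like this (a sketch; the class and job names
are placeholders), so that -conf indextable_properties.xml is merged into the job
configuration by GenericOptionsParser instead of living in the cluster-wide hbase-site.xml:

  import org.apache.hadoop.conf.Configured;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.mapred.JobClient;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.util.Tool;
  import org.apache.hadoop.util.ToolRunner;

  public class LoadTableJob extends Configured implements Tool {
    public int run(String[] args) throws Exception {
      // getConf() already contains the properties merged in from
      // "-conf indextable_properties.xml" by GenericOptionsParser.
      JobConf job = new JobConf(getConf(), LoadTableJob.class);
      job.setJobName("urlsdata-validation load");
      // mapper/reducer/input/output setup elided
      JobClient.runJob(job);
      return 0;
    }

    public static void main(String[] args) throws Exception {
      System.exit(ToolRunner.run(new HBaseConfiguration(), new LoadTableJob(), args));
    }
  }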

ISSUE 2: We are facing serious regionserver problems, often leading to regionserver
shutdowns, like:
2009-09-16 10:21:15,887 INFO org.apache.hadoop.hbase.regionserver.MemcacheFlusher: Too many
store files for region urlsdata-validation,forum.telecharger.01net.com/index.php?page=01net_voter&forum=microhebdo&category=5&topic=344142&post=5653085,1253089082422:
23, waiting

or

2009-09-14 16:39:24,611 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates
for 'IPC Server handler 1 on 60020' on region urlsdata-validation,www.abovetopsecret.com/forum/thread119/pg1&title=Underground+Communities,1252939031807:
Memcache size 128.0m is >= than blocking 128.0m size
2009-09-14 16:39:24,942 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Could not read from stream
2009-09-14 16:39:24,942 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-873614322830930554_111500
2009-09-14 16:39:31,180 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-873614322830930554_111500
bad datanode[0] nodes == null
2009-09-14 16:39:31,181 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations.
Source file "/hbase/urlsdata-validation/1733902030/info/mapfiles/2690714750206504745/data"
- Aborting...
2009-09-14 16:39:31,241 FATAL org.apache.hadoop.hbase.regionserver.MemcacheFlusher: Replay
of hlog required. Forcing server shutdown

I have read some HBase JIRA issues (HBASE-1415, HBASE-1058, HBASE-1084...) concerning
similar problems, but I cannot get a clear idea of what kind of fix is proposed.
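
For what it is worth, here is how we read the 128.0m blocking limit in the second trace
(the property names are taken from our 0.19 hbase-default.xml and are regionserver-side
settings; the code is only there to make the arithmetic explicit, so please correct us
if we misread it):

  // Our reading of "Memcache size 128.0m is >= than blocking 128.0m size":
  // the blocking threshold is the flush size times a multiplier.
  // Assumed import: org.apache.hadoop.hbase.HBaseConfiguration
  HBaseConfiguration conf = new HBaseConfiguration();
  long flushSize  = conf.getLong("hbase.hregion.memcache.flush.size", 64 * 1024 * 1024);
  int  multiplier = conf.getInt("hbase.hregion.memcache.block.multiplier", 2);
  long blockingSize = flushSize * multiplier;  // 64m * 2 = 128m, matching the log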


ISSUE 3: These problems cause table.commit() IOExceptions, losing all the buffered entries:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server
192.168.255.8:60020 for region urlsdata-validation,twitter.com/statuses/434272962,1253089707924,
row 'www.harmonicasurcher.com', but failed after 10 attempts.
Exceptions:
java.io.IOException: Call to /192.168.255.8:60020 failed on local exception: java.io.EOFException
java.net.ConnectException: Call to /192.168.255.8:60020 failed on connection exception: java.net.ConnectException:
Connection refused 

Is there a way to get back the uncommitted entries (there are many of them, because
AutoCommit is false) so we can resubmit them later?
To give an idea, we sometimes lose about 170,000 entries out of 25M due to this commit
exception.
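
The workaround we are considering in the meantime is to keep our own copy of the pending
updates so that a failed flush does not lose them (a sketch; spillForRetry is a
hypothetical helper of ours that would persist the list to a side file for a later
resubmission pass):

  // Assumed imports: java.io.IOException, java.util.ArrayList, java.util.List,
  // org.apache.hadoop.hbase.io.BatchUpdate
  List<BatchUpdate> pending = new ArrayList<BatchUpdate>();

  // per record, mirror what goes into the client write buffer:
  BatchUpdate bu = new BatchUpdate(rowKey);
  bu.put("content:url", urlBytes);
  pending.add(bu);
  table.commit(bu);          // only buffered client side (autoFlush is off)

  try {
    table.flushCommits();    // this is where RetriesExhaustedException hits us
    pending.clear();
  } catch (IOException e) {
    spillForRetry(pending);  // hypothetical: persist for a later resubmit pass
    throw e;
  }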


Guillaume Viland (guillaume.viland@orange-ftgroup.com)
FT/TGPF/OPF/PORTAIL/DOP Sophia Antipolis




