hbase-dev mailing list archives

From Vladimir Rodionov <vrodio...@carrieriq.com>
Subject Data disappears and re-appears again after HBase cluster restart
Date Thu, 22 Jul 2010 23:34:55 GMT
Has anybody encountered this particular bug before?
We have been seeing it intermittently on our small QA cluster.

We run a flow that is basically a custom ETL process over data stored in HDFS; it is a
bunch of M/R jobs.
One of the jobs stores data into HBase (0.20.3); the next one loads the data from HBase (using
a scan), performs additional transformations,
and finally stores the data into an RDBMS.

The flow works fine most of the time: new HBase tables are created, data is loaded,
and it can be read during the next M/R job.

After the flow finishes, the data in the tables (but not the tables themselves) sometimes mysteriously disappears.
This is not deterministic, and to get the data back we need to RESTART the HBase cluster.
So an HBase restart fixes the problem.

The cluster is small (3 servers). RAM is limited (8GB) and there are only 2 CPU cores per server, but the input
data size is small as well, and the disappearing tables are small too: several thousand rows on average.
Hadoop is from CDH2. I cannot give you any additional helpful information
at this time (no log files), but maybe somebody has encountered this
before and has an idea how to fix it.

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodionov@carrieriq.com
