hbase-dev mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Data disappears and re-appears again after HBase cluster restart
Date Thu, 22 Jul 2010 23:38:18 GMT
I would guess clock skew. Do all the machines have approximately the
same time? A few seconds of difference is acceptable, but not more.
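
A minimal sketch of the check suggested above, assuming you have already sampled each server's clock at roughly the same moment (for example by running `date +%s` on every node). The host names, the sample values, and the 5-second tolerance are illustrative assumptions; the message only says "a few seconds".

```python
def max_clock_skew(readings):
    """Return the largest clock difference, in seconds, between any two hosts.

    `readings` maps a host name to the epoch time (in seconds) reported by
    that host, all sampled at approximately the same moment.
    """
    times = list(readings.values())
    return max(times) - min(times)


def skew_ok(readings, tolerance_s=5.0):
    """True if every host is within `tolerance_s` seconds of every other.

    The default tolerance is an assumed stand-in for "a few seconds".
    """
    return max_clock_skew(readings) <= tolerance_s


# Hypothetical readings from a 3-server cluster.
sample = {"node1": 1279841898.0, "node2": 1279841901.0, "node3": 1279841899.5}
print(max_clock_skew(sample), skew_ok(sample))
```

If the check fails, syncing the nodes with NTP is the usual remedy.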


On Thu, Jul 22, 2010 at 4:34 PM, Vladimir Rodionov
<vrodionov@carrieriq.com> wrote:
> Has anybody encountered this particular bug before?
> We have been seeing it intermittently in our small QA cluster.
> We run a flow which is basically a custom ETL process over data stored
> in HDFS. Yes, it is a bunch of M/R jobs.
> One of the jobs stores data into HBase (0.20.3); the next one loads data
> from HBase (using a scan), performs additional transformations,
> and finally stores the data into an RDBMS.
> The flow works fine (most of the time). That is, new HBase tables are
> created, data is loaded, and it can be read during the next M/R job.
> After the flow finishes, data from the tables (but not the tables
> themselves) sometimes mysteriously disappears. This is not deterministic,
> and to get the data back we need to RESTART the HBase cluster.
> So an HBase restart fixes the problem.
> The cluster is small (3 servers). RAM is limited (8GB) and there are only
> 2 CPU cores per server, but the input data size is small as well, and the
> disappearing tables average several thousand rows each, so they are small.
> Hadoop is from CDH2. I cannot give you any additional helpful information
> at this time (no log files), but maybe somebody has encountered this
> before and has an idea how to fix it.
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodionov@carrieriq.com
