hbase-user mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Re: Hbase data loss scenario
Date Thu, 27 Feb 2014 16:10:47 GMT
Hi Kiran,

2 things.

1) Is there any reason for you to use such an old HBase version? Any chance to
migrate to a more recent one? 0.94.17 is out.
2) What do you mean by "I have never seen a weird start key and end key
like this"? I don't see anything wrong with what you described. What do your
keys look like? Can you do a get with the key being "K3.xyz,138798010000.xyp"?

JM
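
The get JM suggests can be run from the HBase shell. A minimal sketch, assuming the table is named 'mytable' (the actual table name is never given in the thread, so substitute your own):

```
# Fetch the row JM asks about from the HBase shell.
# 'mytable' is a placeholder -- replace it with the real table name.
hbase shell <<'EOF'
get 'mytable', 'K3.xyz,138798010000.xyp'
EOF
```

If the get returns the row, the key is a real (if odd-looking) row key rather than corruption in the region metadata.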


2014-02-27 10:55 GMT-05:00 kiran <kiran.sarvabhotla@gmail.com>:

> Adding to that there are many regions with 0MB size and have CF's as
> specified in the table...
>
>
> On Thu, Feb 27, 2014 at 9:23 PM, kiran <kiran.sarvabhotla@gmail.com>
> wrote:
>
> > Hi All,
> >
> > We have been experiencing severe data loss issues for a few hours. There
> > are some weird things going on in the cluster. We were unable to locate
> > the data even in HDFS.
> >
> > Hbase version 0.94.1
> >
> > Here are the weird things that are going on:
> >
> > 1) The table, which was once 1TB, has now become 170GB, with many regions
> > that were once 7GB now shrinking to a few MBs. We have no clue what is
> > happening at all.
> >
> > 2) The table is splitting (or whatever): 100 regions have become 200
> > regions, and ours is ConstantSizeRegionSplitPolicy with a region size of
> > 20GB. I don't know why it is even splitting.
> >
> > 3) The HDFS namenode dump, which we periodically back up, is decreasing
> > in size.
> >
> > 4) And there is a region chain with start keys and end keys like the
> > following (I can't copy-paste the exact thing). For example:
> >
> > K1.xxx K2.xyz
> > K2.xyz K3.xyz,138798010000.xyp
> > K3.xyz,138798010000.xyp K4.xyq
> >
> > I have never seen a weird start key and end key like this. We also
> > suspect a failed split of a region around 20GB. We looked at the logs
> > many times but were unable to make any sense of them. Please help us
> > out; we can't afford data loss.
> >
> > Yesterday, there was a cluster crash involving the root region, but we
> > thought we had successfully restored it. Things didn't go that way...
> > there has been consistent data loss since then.
> >
> >
> > --
> > Thank you
> > Kiran Sarvabhotla
> >
> > -----Even a correct decision is wrong when it is taken late
> >
> >
>
>
> --
> Thank you
> Kiran Sarvabhotla
>
> -----Even a correct decision is wrong when it is taken late
>
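
One way to sanity-check the region chain kiran describes is to verify that consecutive regions are contiguous, i.e. each region's end key equals the next region's start key, which is essentially the region-boundary check that hbck performs. A minimal sketch in Python, using the example keys from the thread:

```python
def find_chain_gaps(regions):
    """Given a list of (start_key, end_key) pairs sorted by start key,
    return index pairs where a region's end key does not match the next
    region's start key (i.e. holes or overlaps in the chain)."""
    gaps = []
    for i in range(len(regions) - 1):
        if regions[i][1] != regions[i + 1][0]:
            gaps.append((i, i + 1))
    return gaps

# The example chain from the thread: contiguous, just with an
# unusual-looking middle boundary key.
chain = [
    ("K1.xxx", "K2.xyz"),
    ("K2.xyz", "K3.xyz,138798010000.xyp"),
    ("K3.xyz,138798010000.xyp", "K4.xyq"),
]
print(find_chain_gaps(chain))  # an empty list means the chain is contiguous
```

Note that the chain above is actually contiguous, matching JM's point that nothing is structurally wrong with it; the middle key merely resembles an HBase region-name fragment (key, timestamp, encoded-name pattern), which would be consistent with the failed split kiran suspects.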
