hbase-dev mailing list archives

From: Todd Lipcon <t...@cloudera.com>
Subject: Re: Data loss due to region server failure
Date: Thu, 02 Sep 2010 15:37:41 GMT
On Thu, Sep 2, 2010 at 7:52 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> By default, this config doesn't appear in hdfs-site.xml or hbase-site.xml
> (cdh3b2)
>
> I searched for this config in the 0.20.6 source code and didn't find a reference:
>
> tyumac:hbase-0.20.6 tyu$ find . -name '*.java' -exec grep 'dfs.support.append' {} \; -print
>
> tyumac:hbase-0.20.6 tyu$ cd ~/hadoop-0.20.2+320/
> tyumac:hadoop-0.20.2+320 tyu$ find . -name '*.java' -exec grep 'dfs.support.append' {} \; -print
>   * configured with the parameter dfs.support.append set to true, otherwise
> ./src/hdfs/org/apache/hadoop/hdfs/protocol/ClientProtocol.java
>      boolean supportAppends = conf.getBoolean("dfs.support.append", false);
> ./src/hdfs/org/apache/hadoop/hdfs/server/datanode/FSDataset.java
>    this.supportAppends = conf.getBoolean("dfs.support.append", false);
>                            " Please refer to dfs.support.append configuration parameter.");
> ./src/hdfs/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
>
> I think the Cloudera people can provide a hint here.
>

Yes, we changed the default inside hdfs-default.xml (inside the jar).
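
In other words, the in-code default of false only applies when no loaded
resource sets the key. A minimal sketch of that resolution order, assuming
Hadoop's standard Configuration semantics (the probe class below is
hypothetical, not from this thread):

    import org.apache.hadoop.conf.Configuration;

    // Hypothetical probe, not part of Hadoop or HBase.
    public class AppendSupportProbe {
      public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Later resources override earlier ones; the boolean passed to
        // getBoolean() is only consulted if no resource defines the key.
        conf.addResource("hdfs-default.xml"); // bundled in the jar; CDH3b2 ships true here
        conf.addResource("hdfs-site.xml");    // per-cluster override, if any
        System.out.println("dfs.support.append = "
            + conf.getBoolean("dfs.support.append", false));
      }
    }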

-Todd


>
> On Thu, Sep 2, 2010 at 7:48 AM, Stack <stack@duboce.net> wrote:
>
> > On Thu, Sep 2, 2010 at 3:34 AM, Gagandeep Singh
> > <gagandeep.singh@paxcel.net> wrote:
> > > Hi Daniel
> > >
> > > I have downloaded hadoop-0.20.2+320.tar.gz from this location
> > > http://archive.cloudera.com/cdh/3/
> >
> >
> > That looks right, yes.
> >
> > > And I also changed the *dfs.support.append* flag to *true* in my
> > > *hdfs-site.xml*, as mentioned here:
> > > http://wiki.apache.org/hadoop/Hbase/HdfsSyncSupport
> > >
> >
> > That sounds right too.  As Ted suggests, put it into all the configs
> > (though I believe it is enabled by default on that branch -- in the UI
> > you'd see a warning if it was NOT enabled).
> >
> > > But data loss is still happening. Am I using the right version?
> > > Are there any other settings that I need to make so that data gets
> > > flushed to HDFS?
> > >
> >
> > It looks like you are doing the right thing.  Can we see the master
> > log, please?
> >
> > Thanks,
> > St.Ack
> >
> >
> > > Thanks,
> > > Gagan
> > >
> > >
> > >
> > > On Thu, Aug 26, 2010 at 11:57 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
> > >
> > >> That, or use CDH3b2.
> > >>
> > >> J-D
> > >>
> > >> On Thu, Aug 26, 2010 at 11:22 AM, Gagandeep Singh
> > >> <gagandeep.singh@paxcel.net> wrote:
> > >> > Thanks Daniel
> > >> >
> > >> > It means I have to check out the code from the branch and build it
> > >> > on my local machine.
> > >> >
> > >> > Gagan
> > >> >
> > >> >
> > >> > On Thu, Aug 26, 2010 at 9:51 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
> > >> >
> > >> >> Then I would expect some form of data loss, yes, because stock
> > >> >> Hadoop 0.20 doesn't have any form of fsync, so HBase doesn't know
> > >> >> whether the data made it to the datanodes when appending to the
> > >> >> WAL. Please use the 0.20-append Hadoop branch with HBase 0.89 or
> > >> >> Cloudera's CDH3b2.
> > >> >>
> > >> >> J-D
> > >> >>
> > >> >> On Thu, Aug 26, 2010 at 7:22 AM, Gagandeep Singh
> > >> >> <gagandeep.singh@paxcel.net> wrote:
> > >> >> > HBase - 0.20.5
> > >> >> > Hadoop - 0.20.2
> > >> >> >
> > >> >> > Thanks,
> > >> >> > Gagan
> > >> >> >
> > >> >> >
> > >> >> >
> > >> >> > On Thu, Aug 26, 2010 at 7:11 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
> > >> >> >
> > >> >> >> Hadoop and HBase version?
> > >> >> >>
> > >> >> >> J-D
> > >> >> >>
> > >> >> >> On Aug 26, 2010 5:36 AM, "Gagandeep Singh" <gagandeep.singh@paxcel.net> wrote:
> > >> >> >>
> > >> >> >> Hi Group,
> > >> >> >>
> > >> >> >> I am checking HBase/HDFS failover. I am inserting 1M records
> > >> >> >> from my HBase client application. I am batching my Put
> > >> >> >> operations such that 10 records get added to a List<Put>, and
> > >> >> >> then I call table.put(). I have not modified the default
> > >> >> >> settings of the Put operation, which means all data is written
> > >> >> >> to the WAL, and in case of server failure my data should not
> > >> >> >> be lost.
> > >> >> >>
> > >> >> >> But I noticed somewhat strange behavior: while adding records,
> > >> >> >> if I kill my Region Server then my application waits until the
> > >> >> >> region data is moved to another region server. But I noticed
> > >> >> >> that while doing so all my data is lost and my table is
> > >> >> >> emptied.
> > >> >> >>
> > >> >> >> Could you help me understand this behavior? Is there some kind
> > >> >> >> of cache also involved while writing, because of which my data
> > >> >> >> is lost?
> > >> >> >>
> > >> >> >>
> > >> >> >> Thanks,
> > >> >> >> Gagan
> > >> >> >>
> > >> >> >
> > >> >>
> > >> >
> > >>
> > >
> >
>
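
(For reference, a sketch of the write pattern described above, against the
0.20-era client API -- the table, family, and qualifier names are made up
for illustration. Note setWriteToWAL(true) is already the default; the
point of this thread is that without a sync-capable HDFS underneath, the
WAL alone cannot make those edits durable.)

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BatchedPutSketch {
      public static void main(String[] args) throws Exception {
        HTable table = new HTable(new HBaseConfiguration(), "testtable");
        List<Put> batch = new ArrayList<Put>();
        for (int i = 0; i < 1000000; i++) {
          Put put = new Put(Bytes.toBytes("row-" + i));
          put.setWriteToWAL(true); // the default, shown for emphasis
          put.add(Bytes.toBytes("f"), Bytes.toBytes("q"),
              Bytes.toBytes("value-" + i));
          batch.add(put);
          if (batch.size() == 10) { // flush every 10 Puts, as described
            table.put(batch);
            batch.clear();
          }
        }
        if (!batch.isEmpty()) {
          table.put(batch);
        }
        table.flushCommits();
      }
    }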



-- 
Todd Lipcon
Software Engineer, Cloudera
