hbase-user mailing list archives

From Matteo Bertozzi <theo.berto...@gmail.com>
Subject Re: Snapshot clone error
Date Fri, 31 Jan 2014 15:46:18 GMT
Thanks for the confirmation.

Can you try exporting the snapshot again, and keep the log file in case the
exported snapshot turns out broken again?
Thanks!

Matteo
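
A minimal sketch of the re-export with the log kept, as requested above. The ExportSnapshot class and its -snapshot / -copy-to options are the standard ones from the HBase book; the function name, the HBASE_BIN override, the destination URL and the log path are assumptions made for illustration only.

```shell
#!/usr/bin/env bash
# Sketch: re-run the snapshot export and keep everything it prints.
# Assumptions (not from the thread): the function name, the HBASE_BIN
# override, and the example destination URL and log path in the usage line.

export_with_log() {
  local snapshot="$1"   # snapshot name on the source cluster
  local dest="$2"       # HBase root dir on the destination cluster
  local log="$3"        # file that receives stdout and stderr of the export

  # Note: the pipeline's exit status is tee's unless `set -o pipefail` is on.
  "${HBASE_BIN:-hbase}" org.apache.hadoop.hbase.snapshot.ExportSnapshot \
    -snapshot "$snapshot" -copy-to "$dest" 2>&1 | tee "$log"
}

# Hypothetical usage (destination URL and log path are placeholders):
# export_with_log table_source_snapshot hdfs://dest-namenode/hbase /tmp/export_snapshot.log
```

If the exported snapshot is broken again, the saved log file is what should be posted to the list.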



On Fri, Jan 31, 2014 at 3:43 PM, David Koch <ogdude@googlemail.com> wrote:

> Actually, I just noticed: the snapshot on the source cluster is ok, it's the
> exported snapshot on the destination cluster that's corrupted.
>
>
> On Fri, Jan 31, 2014 at 4:40 PM, David Koch <ogdude@googlemail.com> wrote:
>
> > Thanks for your reply,
> >
> > As a matter of fact when running with the "-files" option it turns out a
> > lot of files are missing from the snapshot which I did not manage to
> > restore. It's possible that hbck was run during snapshotting.
> >
> > **************************************************************
> > BAD SNAPSHOT: 6659 hfile(s) and 0 log(s) missing.
> > **************************************************************
> > 78 HFiles (78 in archive), total size 14.3 G (0.00% 0 shared with the
> > source table)
> > 0 Logs, total size 0
> >
> > 78 files is exactly the number of regions that I found after attempting
> > restoration.
> >
> > We followed standard procedure as described in the manual:
> > http://hbase.apache.org/book/ops.snapshots.html
> >
> > I will try again and make sure no hbck is intervening.
> >
> > /David
> >
> >
> > On Fri, Jan 31, 2014 at 4:20 PM, Matteo Bertozzi <theo.bertozzi@gmail.com> wrote:
> >
> >> you should use SnapshotInfo with the "-files" option and you'll probably
> >> see that one snapshot is corrupted.
> >> in HBase 0.94.15/CDH 4.6 there will be a fix (HBASE-10111) that will
> >> prevent restoring/cloning a corrupted snapshot.
> >>
> >> a corrupted snapshot means that some file contained in the snapshot is
> >> missing from the .archive
> >> that situation may happen if you have removed files by hand, or you ran
> >> hbck, which sidelined the files, or similar
> >> (unless there is a bug somewhere)
> >> do you remember the steps that you followed? did you use ExportSnapshot?
> >> did you move the files by hand to another cluster or similar?
> >>
> >> Offline or Online snapshot shouldn't make a difference, the corruption
> >> probably happened after taking the snapshot.
> >> You can retry taking the snapshot, and periodically run SnapshotInfo with
> >> the -files option to verify the state and post the logs in case you get a
> >> corruption again.
> >>
> >> Matteo
> >>
> >>
> >>
> >> On Fri, Jan 31, 2014 at 3:10 PM, David Koch <ogdude@googlemail.com>
> >> wrote:
> >>
> >> > Matteo,
> >> >
> >> > Thank you for your reply. All clients and servers are using the same
> >> > version:
> >> >
> >> > 14/01/31 16:06:20 INFO util.VersionInfo: HBase 0.94.6-cdh4.5.0
> >> >
> >> > Also, the information generated by:
> >> >
> >> > hbase org.apache.hadoop.hbase.snapshot.SnapshotInfo -snapshot
> >> >
> >> > is identical for snapshots which I managed to clone and those for which the
> >> > cloning/restoration failed. Would you advise retrying snapshotting the
> >> > table while it is disabled? Otherwise I'll go with old-fashioned CopyTable
> >> > or re-import into HBase from HDFS files.
> >> >
> >> > Thank you,
> >> >
> >> > /David
> >> >
> >> >
> >> > On Fri, Jan 31, 2014 at 2:42 PM, Matteo Bertozzi <theo.bertozzi@gmail.com> wrote:
> >> >
> >> > > the snapshot seems to be corrupted, which version are you running?
> >> > >
> >> > > Matteo
> >> > >
> >> > >
> >> > >
> >> > > On Fri, Jan 31, 2014 at 1:06 PM, David Koch <ogdude@googlemail.com> wrote:
> >> > >
> >> > > > Hello,
> >> > > >
> >> > > > We export an online snapshot of a table to a different cluster. When
> >> > > > attempting a clone on the destination cluster using:
> >> > > >
> >> > > > clone_snapshot 'table_source_snapshot', 'table_dest'
> >> > > >
> >> > > > it does not work.
> >> > > >
> >> > > > The operation times out after a while:
> >> > > >
> >> > > > ERROR: java.io.IOException: Table 'table_dest' not yet enabled, after
> >> > > > 1996939ms.
> >> > > >
> >> > > > and I see only a fraction of the number of regions in the destination
> >> > > > table. Table is indicated as "enabled" but I cannot perform any scans on
> >> > > > it.
> >> > > >
> >> > > > The snapshot info returns the following:
> >> > > >
> >> > > > Snapshot Info
> >> > > > ----------------------------------------
> >> > > >    Name: table_source_snapshot
> >> > > >    Type: FLUSH
> >> > > >   Table: table_source
> >> > > >  Format: 0
> >> > > > Created: 2014-01-30T13:05:02
> >> > > >
> >> > > > Snapshot seems to be intact. What could be the error? Should I take an
> >> > > > offline snapshot instead? Going via restore/enable instead of clone does
> >> > > > not seem to work either.
> >> > > >
> >> > > > Also, I see the following in the region servers:
> >> > > >
> >> > > > 2:24:35.807 PM ERROR
> >> > > > org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler
> >> > > > Failed open of
> >> > > > region=table_source,\x82\x12Y\x00\xE98C\xEE\xBC\xCC\xE3h\xDAPt\xA6,1366259070788.63ca017ac7cd03e68c35a4da8b56421d.,
> >> > > > starting to roll back the global memstore size.
> >> > > > java.io.IOException: java.io.IOException: java.io.FileNotFoundException:
> >> > > > Unable to open link: org.apache.hadoop.hbase.io.HFileLink
> >> > > > locations=[hdfs://nameservice1/hbase/table_source/816bb88c6f3524a877f4cb7ce747fec1/t/c3b37dc11e684626a5b464a25a75735c,
> >> > > > hdfs://nameservice1/hbase/.tmp/table_source/816bb88c6f3524a877f4cb7ce747fec1/t/c3b37dc11e684626a5b464a25a75735c,
> >> > > > hdfs://nameservice1/hbase/.archive/table_source/816bb88c6f3524a877f4cb7ce747fec1/t/c3b37dc11e684626a5b464a25a75735c]
> >> > > >
> >> > > > None of these paths actually exist, however:
> >> > > >
> >> > > > hdfs://nameservice1/hbase/.snapshot/table_source_snapshot/816bb88c6f3524a877f4cb7ce747fec1/t/c3b37dc11e684626a5b464a25a75735c
> >> > > > does exist.
> >> > > >
> >> > > > I don't think that's the issue though, since I applied the same steps to a
> >> > > > smaller table and it worked.
> >> > > >
> >> > > > Any advice is appreciated,
> >> > > >
> >> > > > Regards,
> >> > > >
> >> > > > /David
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>
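
The check suggested in the thread (run SnapshotInfo with -files and look for a "BAD SNAPSHOT" line) could be scripted roughly as below. This is only a sketch: it assumes the output format quoted earlier in the thread ("BAD SNAPSHOT: N hfile(s) and M log(s) missing."), and the function name missing_hfiles is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: report how many hfiles SnapshotInfo says are missing.
# Assumption: a corrupted snapshot prints a line of the form
#   BAD SNAPSHOT: <N> hfile(s) and <M> log(s) missing.
# as quoted in the thread. The function name is hypothetical.

# Reads SnapshotInfo output on stdin and prints the number of missing
# hfiles, or 0 when no BAD SNAPSHOT line is present.
missing_hfiles() {
  local n
  n=$(sed -n 's/.*BAD SNAPSHOT: \([0-9][0-9]*\) hfile(s).*/\1/p' | head -n 1)
  echo "${n:-0}"
}

# Usage on a cluster (run on both source and destination after the export):
# hbase org.apache.hadoop.hbase.snapshot.SnapshotInfo \
#   -snapshot table_source_snapshot -files | missing_hfiles
```

A non-zero result on the destination cluster but not on the source would match the situation described above, where only the exported copy is corrupted.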
