hbase-dev mailing list archives

From Dave Latham <lat...@davelink.net>
Subject Re: Changing it so we do NOT archive hfiles by default
Date Mon, 24 Nov 2014 17:51:34 GMT
Even with the manifest feature, it is still inefficient to iterate over
every hfile in the archive and ask the namenode, for each one, whether
there is a new snapshot manifest that may reference that single hfile,
rather than doing that check a single time for the full list of all
archived hfiles.
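
To make the batch version concrete, here is a minimal sketch of the
set-difference approach (the class and helper names are illustrative,
not the actual SnapshotFileCache API):

  import java.util.ArrayList;
  import java.util.HashSet;
  import java.util.List;
  import java.util.Set;

  // Sketch only: decide deletability for a whole batch of archived
  // hfiles with one pass over the manifests, instead of one namenode
  // query per file.
  public class BatchArchiveCheck {

    // Hypothetical helper: read every snapshot manifest once and
    // return the union of all hfile names they still reference.
    static Set<String> referencedByAnySnapshot() {
      return new HashSet<String>(); // ... single pass over manifests ...
    }

    static List<String> deletable(List<String> archivedHfiles) {
      Set<String> referenced = referencedByAnySnapshot(); // one check
      List<String> out = new ArrayList<String>();
      for (String hfile : archivedHfiles) {
        if (!referenced.contains(hfile)) { // in-memory, no NN call
          out.add(hfile);
        }
      }
      return out;
    }
  }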

On Sat, Nov 22, 2014 at 3:08 PM, lars hofhansl <larsh@apache.org> wrote:

> Hit send too fast.
> I meant to say: "Actually is HBASE-11360 still needed when we have
> manifests?"
>       From: lars hofhansl <larsh@apache.org>
>  To: "dev@hbase.apache.org" <dev@hbase.apache.org>; lars hofhansl <
> larsh@apache.org>
>  Sent: Saturday, November 22, 2014 2:58 PM
>  Subject: Re: Changing it so we do NOT archive hfiles by default
>
> Actually in HBASE-11360 when we have manifests? The problem was scanning
> all those reference files; now those are all replaced with a manifest, so
> maybe this is not a problem.
> -- Lars
>
>       From: lars hofhansl <larsh@apache.org>
>
>
>  To: "dev@hbase.apache.org" <dev@hbase.apache.org>
>  Sent: Saturday, November 22, 2014 2:41 PM
>  Subject: Re: Changing it so we do NOT archive hfiles by default
>
> I agree. I did not realize we undid HBASE-11360. Offhand I see no reason
> why it had to be rolled back completely, rather than being adapted. We need
> to bring that functionality back.
> -- Lars
>       From: Dave Latham <latham@davelink.net>
>
>
>  To: dev@hbase.apache.org
>  Sent: Saturday, November 22, 2014 7:04 AM
>  Subject: Re: Changing it so we do NOT archive hfiles by default
>
> If no snapshots are enabled, then I'll definitely be curious to hear more
> about the cause of the cleaner not keeping up.
> I also think it's reasonable to delete files directly if there is no use
> for them in the archive.
>
> However, HBase does need to be able to handle large-scale archive cleaning
> for those who are using archive-based features.  One important way is
> processing the checks in batches rather than one at a time.  Taking
> HBASE-11360 as an example, even with the manifest file there's no reason
> we can't still check the archive files against the manifests in batches
> rather than reverting to one check at a time - that part of the fix is
> compatible and still important.
>
> I hope you guys are able to get past the issue for this cluster, and
> that we can also address it more generally.
>
>
>
> On Fri, Nov 21, 2014 at 3:16 PM, Esteban Gutierrez <esteban@cloudera.com>
> wrote:
>
> > For the specific case Stack mentioned here, there are no snapshots
> > enabled and it's a 0.94.x release, so there is no real need for this
> > user to have the archive enabled. I've also seen this issue on 0.98
> > with a busy NN (deletions pile up).
> >
> > I think it should be fine to fall back to the old behavior if snapshots
> > are not being used and delete compacted files or HFiles from a dropped
> > table immediately.
> >
> > One problem with HBASE-11360 was maintaining compatibility with
> > snapshots in the current way they work in branch-1 with the manifest
> > file.
> >
> > cheers,
> > esteban.
> >
> >
> >
> >
> > --
> > Cloudera, Inc.
> >
> >
> > On Fri, Nov 21, 2014 at 2:50 PM, Dave Latham <latham@davelink.net>
> > wrote:
> >
> > > Yes, there were definitely issues with the way the file cleaners
> > > worked, where they end up doing NameNode lookups or even scans for
> > > every single file before the cleaner allows it to be removed.
> > > What version of HBase are these folks using?  Do they have snapshots
> > > enabled?
> > > Here are the tickets where we fixed a couple of these performance
> > > issues where the master just could not keep up at large scale:
> > > HBASE-9208 for the slow ReplicationLogCleaner and HBASE-11360 for
> > > usage of the SnapshotFileCache.
> > > The fixes were generally to check batches of files at a time instead of
> > > hitting the NameNode for every file.
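> > >
> > > Roughly, the batch shape looks like this (a sketch only -- the
> > > delegate method name follows the cleaner plugin interface as I
> > > recall it, and the snapshot-cache refresh is hypothetical):
> > >
> > >   import java.util.ArrayList;
> > >   import java.util.List;
> > >   import java.util.Set;
> > >   import org.apache.hadoop.fs.FileStatus;
> > >
> > >   // Sketch: judge a whole batch against one cached snapshot
> > >   // listing rather than asking the NameNode about each file.
> > >   public abstract class BatchingCleanerSketch {
> > >     // Hypothetical: one refresh of the snapshot file set per batch.
> > >     abstract Set<String> refreshSnapshotFileNames();
> > >
> > >     public Iterable<FileStatus> getDeletableFiles(
> > >         Iterable<FileStatus> files) {
> > >       Set<String> inUse = refreshSnapshotFileNames(); // per batch
> > >       List<FileStatus> deletable = new ArrayList<FileStatus>();
> > >       for (FileStatus file : files) {
> > >         if (!inUse.contains(file.getPath().getName())) {
> > >           deletable.add(file);
> > >         }
> > >       }
> > >       return deletable;
> > >     }
> > >   }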
> > >
> > > I'm sorry to see that HBASE-11360 was reverted with HBASE-11742, so if
> > > snapshots are enabled that could be the same issue.
> > >
> > > I'd be sad to see the solution be that you can't have snapshots or
> > > backups and also operate at large scale.
> > >
> > > Dave
> > >
> > > On Thu, Nov 20, 2014 at 12:42 PM, Stack <stack@duboce.net> wrote:
> > >
> > > > On Thu, Nov 20, 2014 at 12:21 PM, lars hofhansl <larsh@apache.org>
> > > > wrote:
> > > >
> > > > > Interesting that removing the files (which is just a metadata
> > > > > operation in the NN) is slower than writing the files with all
> > > > > their data in the first place. Is it really the NN that is the
> > > > > gating factor, or is it the algorithm we have in HBase? I remember
> > > > > we had a similar issue with the HLog removal, where we rescanned
> > > > > the WAL directory over and over for no good reason, and the nice
> > > > > guys from Flurry did a fix.
> > > > >
> > > >
> > > > There is, yes, an open question as to why the cleaner cannot keep
> > > > up. Am looking into this too (high level: millions of files in a
> > > > single dir).
> > > >
> > > >
> > > >
> > > > > We have a lot of stuff relying on this now, so it should be done
> > > > > carefully. You thinking 1.0+, or even earlier releases?
> > > > >
> > > > >
> > > > Yes. It seems a bunch of items have come to rely on this behavior
> > > > since it was introduced way back. Was thinking 1.0, yes, but after
> > > > the input here and offlist from Matteo, my hope of an easy fix has
> > > > taken a dent.
> > > >
> > > > Thanks for the input, lads,
> > > > St.Ack
> > > >
> > > >
> > > >
> > > >
> > > > > -- Lars
> > > > >      From: Stack <stack@duboce.net>
> > > > >  To: HBase Dev List <dev@hbase.apache.org>
> > > > >  Sent: Thursday, November 20, 2014 11:08 AM
> > > > >  Subject: Changing it so we do NOT archive hfiles by default
> > > > >
> > > > > I think we should swap the default that has us archive hfiles
> > > > > rather than just outright delete them when we are done with them.
> > > > > The current configuration works for the minority of us who are
> > > > > running backup tools. For the rest of us, our clusters are doing
> > > > > unnecessary extra work.
> > > > >
> > > > > Background:
> > > > >
> > > > > Since 0.94 (https://issues.apache.org/jira/browse/HBASE-5547),
> > > > > when we are done with an hfile, it is moved to the 'archive'
> > > > > (hbase/.archive) directory. A thread in the master then removes
> > > > > hfiles older than some configured time. We do this rather than
> > > > > just delete hfiles to facilitate backup tools -- let backup tools
> > > > > have a say in when an hfile is safe to remove.
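> > > > >
> > > > > (That 'configured time' is the hfile cleaner's TTL -- if I have
> > > > > the property name right, tuning it in hbase-site.xml looks like
> > > > > the following; the value shown is just an example:)
> > > > >
> > > > >   <property>
> > > > >     <name>hbase.master.hfilecleaner.ttl</name>
> > > > >     <value>300000</value> <!-- ms an hfile sits in the archive -->
> > > > >   </property>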
> > > > >
> > > > > The subject on HBASE-5547 has it that the archiving behavior only
> > > > > happens when the cluster is in 'backup mode', but as it turns out
> > > > > later in the issue discussion, the implementation becomes
> > > > > significantly easier if we just always archive, and that is what
> > > > > we ended up implementing and committing.
> > > > >
> > > > > These last few days, a few of us have been helping a user on a
> > > > > large cluster who is (temporarily) doing loads of compactions,
> > > > > with the replaced hfiles being moved to hbase/.archive. The
> > > > > cleaning thread in the master is not working fast enough deleting
> > > > > the hfiles, so there is buildup going on -- so much so that it's
> > > > > slowing the whole cluster down (NN operations over tens of
> > > > > millions of files).
> > > > >
> > > > > Any problem with swapping the default and having users opt in to
> > > > > archiving? (I'd leave it as is in released software.)  I will also
> > > > > take a look at having the cleaner thread do more work per cycle.
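> > > > >
> > > > > Concretely, the opt-in might look something like the following in
> > > > > hbase-site.xml (the property name is a strawman -- nothing like
> > > > > it exists today):
> > > > >
> > > > >   <property>
> > > > >     <name>hbase.hfile.archive.enabled</name> <!-- hypothetical -->
> > > > >     <value>true</value> <!-- opt back in to archiving -->
> > > > >   </property>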
> > > > >
> > > > > Thanks,
> > > > > St.Ack
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
>
