From: Josh Elser
To: dev@hbase.apache.org
Subject: Re: [DISCUSS] FileSystem Quotas in HBase
Date: Thu, 03 Nov 2016 15:58:45 -0400
Message-ID: <581B96F5.4040507@apache.org>

Done.

Ted Yu wrote:
> Josh:
> Please capture the following in design doc.
>
> Thanks
>
> On Wed, Nov 2, 2016 at 3:28 PM, Enis Söztutar wrote:
>
>> Thanks Andrew,
>>
>> I forgot to mention that we have considered using the HDFS quota
>> enforcement directly as well, but decided against it for a couple of
>> reasons.
>> - Our current layout has files in the data directory, as well as archive
>> directory and WALs, etc. Since there is no option for HDFS quotas to span
>> multiple directories, we can only use the HDFS quotas for main data files,
>> and not snapshots, etc unless we do major surgery in our file layouts. This
>> will get more complicated if we want to do flat layout, etc later on.
>> - Since WALs would not be in any namespace unless we do wal-per-namespace,
>> that means that once a single NS's HDFS quota is reached, it might affect
>> everybody else and potentially cause havoc on the cluster. The problem
>> would be that if a single NS is out of space, we cannot perform flushes at
>> all. This would cause the WALs to be backed up and kept forever and affect
>> all of the other regions from different tables / namespaces causing
>> unavailability for unrelated tables. Wal-per-namespace also has to be
>> implemented and WALs be moved under a shared NS directory to share the data
>> and WAL requiring further layout changes. It also will not be optimal if
>> there is a large number of namespaces.
>> - Will only work with HDFS, while HBase can use other file systems.
>>
>> Enis
>>
>> On Wed, Nov 2, 2016 at 3:01 PM, Andrew Purtell wrote:
>>
>>> Another approach to hard limits could be pushing the quota down to the HDFS
>>> level, because HDFS would have a very accurate assessment of quota
>>> utilization at all times, but this would only work with HDFS and impose
>>> limits on how HBase structures storage on the filesystem (e.g. all files
>>> for a namespace must be under a common root).
>>> Still, implementation would be "easy": over hard quota, all allocations
>>> would fail, and the bulk of the effort is hardening the response to
>>> allocation failures.
>>>
>>> On Wed, Nov 2, 2016 at 1:11 PM, Enis Söztutar wrote:
>>>
>>>> Thanks Josh for the doc and pursuing this.
>>>>
>>>> I was involved with some of the design choices so consider me a +1 on the
>>>> general approach. One topic which is not covered here is the other design
>>>> decision that we could have pursued: a more strict control on the quota
>>>> usage so that we would always guarantee that the namespace / table
>>>> cannot use more than the allocated disk space. This hard-limit approach
>>>> would differ from the proposed "soft-limit" approach because the
>>>> soft-limit approach can end up overusing the disk space by a small amount
>>>> (because it takes time to detect that the quota limit is reached and to
>>>> enforce the limit).
>>>>
>>>> The hard-limit approach may be built with a lease kind of mechanism
>>>> where the master gives away disk space leases to region servers from the
>>>> remaining limit, and the regionservers make sure that they cannot allocate
>>>> more space than the lease dictates. By ensuring that the space is
>>>> pre-allocated via leases, we can always make sure that strict limits are
>>>> applied. However, this approach would be harder to build and stabilize
>>>> because it will need new mechanisms for distributing and managing this
>>>> kind of lease, and tuning the allocations to make sure that regionservers
>>>> never block flushes or compactions due to lack of a lease in time would
>>>> prove challenging to get right.
>>>>
>>>> We generally think that the "soft-limit" approach would be a good enough
>>>> approximation and the error bounds on over-allocation would be minimal and
>>>> negligible in production.
>>>> Thus, the proposal is to implement the soft approach with good
>>>> documentation about how much space can be over-allocated in a worst-case
>>>> scenario.
>>>>
>>>> Enis
>>>>
>>>> On Wed, Nov 2, 2016 at 12:15 PM, Josh Elser wrote:
>>>>
>>>>> Thanks for the reviews so far, Ted and Stack. The comments were great and
>>>>> much appreciated.
>>>>>
>>>>> Interpreting consensus from lack of objection, I'm going to move ahead in
>>>>> earnest starting to work on what was described in the doc. Expect to see
>>>>> some work break-out happening under HBASE-16961 and patches starting to
>>>>> land.
>>>>>
>>>>> I'm also happy to entertain more discussion if anyone hasn't found the
>>>>> time to read/comment yet.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> - Josh
>>>>>
>>>>>
>>>>> Josh Elser wrote:
>>>>>
>>>>>> Sure thing, Ted.
>>>>>>
>>>>>> https://docs.google.com/document/d/1VtLWDkB2tpwc_zgCNPE1ulZOeecF-YA2FYSK3TSs_bw/edit?usp=sharing
>>>>>>
>>>>>>
>>>>>> Let me open an umbrella issue for now. I can break up the work later.
>>>>>> https://issues.apache.org/jira/browse/HBASE-16961
>>>>>>
>>>>>> Ted Yu wrote:
>>>>>>
>>>>>>> Josh:
>>>>>>> Can you put the doc in google doc so that people can comment on it ?
>>>>>>> Is there a JIRA opened for this work ?
>>>>>>> Please open one if there is none.
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> On Fri, Oct 28, 2016 at 9:00 AM, Josh Elser wrote:
>>>>>>>
>>>>>>>> Hi folks,
>>>>>>>>
>>>>>>>> I'd like to propose the introduction of FileSystem quotas to HBase.
>>>>>>>> Here's a design doc[1] available which (hopefully) covers all of the
>>>>>>>> salient points of what I think an initial version of such a feature
>>>>>>>> would include.
>>>>>>>>
>>>>>>>> tl;dr We can define quotas on tables and namespaces. Region size is
>>>>>>>> computed by RegionServers and sent to the Master. The Master inspects
>>>>>>>> the sizes of Regions, rolling up to table and namespace sizes.
>>>>>>>> Defined quotas in the quota table are evaluated given the computed
>>>>>>>> sizes, and, for those tables/namespaces violating the quota,
>>>>>>>> RegionServers are informed to take some action to limit any further
>>>>>>>> filesystem growth by that table/namespace.
>>>>>>>>
>>>>>>>> I'd encourage you to give the document a read -- I tried to cover as
>>>>>>>> much as I could without getting unnecessarily bogged down in
>>>>>>>> implementation details.
>>>>>>>>
>>>>>>>> Feedback is, of course, welcomed. I'd like to start sketching out a
>>>>>>>> breakdown of the work (all writing and no programming makes Josh a sad
>>>>>>>> boy). I'm happy to field any/all questions. Thanks in advance.
>>>>>>>>
>>>>>>>> - Josh
>>>>>>>>
>>>>>>>> [1] http://home.apache.org/~elserj/hbase/FileSystemQuotasforApacheHBase.pdf
>>>>>>>>
>>>>>>>>
>>>
>>>
>>> --
>>> Best regards,
>>>
>>> - Andy
>>>
>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>> (via Tom White)
>>>
>
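
For readers of the archive, the soft-limit flow in the tl;dr of the original proposal (RegionServers report Region sizes, the Master rolls them up to table and namespace sizes and compares them against the defined quotas) might be sketched roughly as below. This is only an illustration under stated assumptions, not the design doc's implementation or any HBase API: the function names `roll_up` and `find_violations`, the `"namespace:table"` key format, the dict-based size reports, and the choice to treat "strictly greater than the limit" as a violation are all hypothetical.

```python
from collections import defaultdict


def roll_up(region_sizes):
    """Aggregate per-Region sizes into table and namespace totals.

    region_sizes maps (namespace, table, region) -> size in bytes, standing in
    for the sizes RegionServers would report to the Master (hypothetical shape).
    """
    table_sizes = defaultdict(int)
    namespace_sizes = defaultdict(int)
    for (namespace, table, _region), size in region_sizes.items():
        table_sizes[f"{namespace}:{table}"] += size
        namespace_sizes[namespace] += size
    return dict(table_sizes), dict(namespace_sizes)


def find_violations(region_sizes, table_quotas, namespace_quotas):
    """Return sorted (kind, name) pairs whose rolled-up size exceeds the quota.

    Assumption: a quota is violated only when usage is strictly greater than
    the limit. Usage is measured after the fact, so a table can overshoot its
    limit slightly before the violation is noticed (the "soft" in soft limit).
    """
    table_sizes, namespace_sizes = roll_up(region_sizes)
    violations = []
    for name, limit in table_quotas.items():
        if table_sizes.get(name, 0) > limit:
            violations.append(("table", name))
    for name, limit in namespace_quotas.items():
        if namespace_sizes.get(name, 0) > limit:
            violations.append(("namespace", name))
    return sorted(violations)


if __name__ == "__main__":
    reported = {
        ("ns1", "t1", "r1"): 60,
        ("ns1", "t1", "r2"): 50,
        ("ns1", "t2", "r1"): 10,
        ("ns2", "t3", "r1"): 5,
    }
    # ns1:t1 holds 110 bytes against a table quota of 100, so it is flagged.
    print(find_violations(reported, {"ns1:t1": 100}, {"ns1": 200}))
```

In the proposal, the entries returned here would correspond to the tables/namespaces for which RegionServers are told to take some action to limit further filesystem growth; what that action is remains outside this sketch.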