Mailing-List: contact dev-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@hbase.apache.org
Message-ID: <581B96F5.4040507@apache.org>
Date: Thu, 03 Nov 2016 15:58:45 -0400
From: Josh Elser <elserj@apache.org>
User-Agent: Postbox 3.0.11 (Macintosh/20140602)
MIME-Version: 1.0
To: dev@hbase.apache.org
Subject: Re: [DISCUSS] FileSystem Quotas in HBase
References: <5813762F.2000307@apache.org> <CALte62yE-k7azJ0ryjcLodCKrchyPbxL_Ovywt4qTZ1V9ZG-VA@mail.gmail.com> <58137C29.8040700@apache.org> <581A3B6D.9010504@apache.org> <CAMUu0w8hD-JTLTjR0ONNZnWJUTBJGOz5adeYJZr4xrL4jE=-0w@mail.gmail.com> <CA+RK=_BcumaYibPccDPFrBjFTEz8KSNEggTLEexuH82aCbSOSg@mail.gmail.com> <CAMUu0w_jn4KLMmSqeash3e8iqq8cLgUhSavZyQv7xXttYKnF8g@mail.gmail.com> <CALte62xaBF2_7jPo8hrSPfa85DEXjusoQ7eSvivFPZZRR-togw@mail.gmail.com>
In-Reply-To: <CALte62xaBF2_7jPo8hrSPfa85DEXjusoQ7eSvivFPZZRR-togw@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
archived-at: Thu, 03 Nov 2016 19:58:55 -0000

Done.

Ted Yu wrote:
> Josh:
> Please capture the following in design doc.
>
> Thanks
>
> On Wed, Nov 2, 2016 at 3:28 PM, Enis Söztutar<enis.soz@gmail.com>  wrote:
>
>> Thanks Andrew,
>>
>> I forgot to mention that we have considered using the HDFS quota
>> enforcement directly as well, but decided against it for a couple of
>> reasons.
>>   - Our current layout has files in the data directory, as well as archive
>> directory and WALs, etc. Since there is no option for HDFS quotas to span
>> multiple directories, we can only use the HDFS quotas for main data files,
>> and not snapshots, etc., unless we do major surgery in our file layouts. This
>> will get more complicated if we want to do flat layout, etc. later on.
>>   - Since WALs would not be in any namespace unless we do wal-per-namespace,
>> once a single NS's HDFS quota is reached, it might affect everybody else
>> and potentially cause havoc on the cluster. The problem is that if a
>> single NS is out of space, we cannot perform flushes at all. This would
>> cause the WALs to be backed up and kept forever, affecting all of the
>> other regions from different tables / namespaces and causing
>> unavailability for unrelated tables. Wal-per-namespace would also have to
>> be implemented, and the WALs moved under a shared NS directory so that
>> data and WALs share one location, requiring further layout changes. It
>> also would not be optimal if there is a large number of namespaces.
>>   - Will only work with HDFS, while HBase can use other file systems.
>>
>> Enis
>>
>> On Wed, Nov 2, 2016 at 3:01 PM, Andrew Purtell<apurtell@apache.org>
>> wrote:
>>
>>> Another approach to hard limits could be pushing the quota down to the HDFS
>>> level, because HDFS would have a very accurate assessment of quota
>>> utilization at all times, but this would only work with HDFS and impose
>>> limits on how HBase structures storage on the filesystem (e.g. all files
>>> for a namespace must be under a common root). Still, implementation would
>>> be "easy": over hard quota, all allocations would fail; the bulk of the
>>> effort is hardening the response to allocation failures.
>>>
>>> On Wed, Nov 2, 2016 at 1:11 PM, Enis Söztutar<enis@apache.org>  wrote:
>>>
>>>> Thanks Josh for the doc and pursuing this.
>>>>
>>>> I was involved with some of the design choices so consider me a +1 on the
>>>> general approach. One topic not covered here is the other design decision
>>>> we could have pursued: stricter control of quota usage, so that we would
>>>> always guarantee that the namespace / table cannot use more than its
>>>> allocated disk space. This hard-limit approach would differ from the
>>>> proposed "soft-limit" approach because the soft-limit approach can end up
>>>> overusing the disk space by a small amount (because it takes time to
>>>> detect that the quota limit is reached and to enforce the limit).
>>>>
>>>> The hard-limit approach may be built with a lease-like mechanism where
>>>> the master gives away disk space leases to region servers from the
>>>> remaining limit, and the regionservers make sure that they cannot allocate
>>>> more space than the lease dictates. By ensuring that the space is
>>>> pre-allocated via leases, we can always make sure that strict limits are
>>>> applied. However, this approach would be harder to build and stabilize
>>>> because it would need new mechanisms for distributing and managing these
>>>> leases. Tuning the allocations so that regionservers never block flushes
>>>> or compactions for lack of a timely lease would also prove challenging to
>>>> get right.
>>>>
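
To make the lease idea above concrete, here is a minimal single-process
sketch. It is not taken from the design doc: the class names, the
chunk_bytes parameter, and the "request at least one chunk" policy are all
assumptions for illustration, and a real implementation would also need
lease expiry, RPCs, and failure handling.

```python
class QuotaLeaseMaster:
    """Hypothetical master-side bookkeeping for one table/namespace quota."""

    def __init__(self, limit_bytes):
        self.remaining = limit_bytes  # space not yet leased to anyone

    def grant(self, requested_bytes):
        # Never hand out more than what is left of the quota.
        granted = min(requested_bytes, self.remaining)
        self.remaining -= granted
        return granted


class RegionServerLease:
    """Hypothetical regionserver-side view of its leased space."""

    def __init__(self, master, chunk_bytes):
        self.master = master
        self.chunk_bytes = chunk_bytes  # assumed minimum lease request size
        self.leased = 0
        self.used = 0

    def try_allocate(self, nbytes):
        # Ask the master for more space only when the current lease is short.
        shortfall = self.used + nbytes - self.leased
        if shortfall > 0:
            self.leased += self.master.grant(max(shortfall, self.chunk_bytes))
        if self.used + nbytes > self.leased:
            return False  # would exceed the lease: caller must block or fail
        self.used += nbytes
        return True
```

Because every byte written is covered by a lease and the leases never sum
to more than the limit, total usage cannot exceed the quota. The price is
the blocking case (try_allocate returning False), which is the tuning
problem described above.
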
>>>> We generally think that the "soft-limit" approach would be a good enough
>>>> approximation and the error bounds on over-allocation would be minimal and
>>>> negligible in production.  Thus, the proposal is to implement the soft
>>>> approach with good documentation about how much space can be over-allocated
>>>> in a worst-case scenario.
>>>>
>>>> Enis
>>>>
>>>> On Wed, Nov 2, 2016 at 12:15 PM, Josh Elser<elserj@apache.org>  wrote:
>>>>
>>>>> Thanks for the reviews so far, Ted and Stack. The comments were great and
>>>>> much appreciated.
>>>>>
>>>>> Interpreting consensus from lack of objection, I'm going to move ahead in
>>>>> earnest starting to work on what was described in the doc. Expect to see
>>>>> some work break-out happening under HBASE-16961 and patches starting to
>>>>> land.
>>>>>
>>>>> I'm also happy to entertain more discussion if anyone hasn't found the
>>>>> time to read/comment yet.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> - Josh
>>>>>
>>>>>
>>>>> Josh Elser wrote:
>>>>>
>>>>>> Sure thing, Ted.
>>>>>>
>>>>>> https://docs.google.com/document/d/1VtLWDkB2tpwc_zgCNPE1ulZOeecF-YA2FYSK3TSs_bw/edit?usp=sharing
>>>>>>
>>>>>>
>>>>>> Let me open an umbrella issue for now. I can break up the work later.
>>>>>> https://issues.apache.org/jira/browse/HBASE-16961
>>>>>>
>>>>>> Ted Yu wrote:
>>>>>>
>>>>>>> Josh:
>>>>>>> Can you put the doc in a Google doc so that people can comment on it?
>>>>>>> Is there a JIRA opened for this work?
>>>>>>> Please open one if there is none.
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> On Fri, Oct 28, 2016 at 9:00 AM, Josh Elser<elserj@apache.org> wrote:
>>>>>>>
>>>>>>>> Hi folks,
>>>>>>>> I'd like to propose the introduction of FileSystem quotas to HBase.
>>>>>>>> Here's a design doc[1] available which (hopefully) covers all of the
>>>>>>>> salient points of what I think an initial version of such a feature
>>>>>>>> would include.
>>>>>>>>
>>>>>>>> tl;dr We can define quotas on tables and namespaces. Region size is
>>>>>>>> computed by RegionServers and sent to the Master. The Master inspects
>>>>>>>> the sizes of Regions, rolling up to table and namespace sizes. Defined
>>>>>>>> quotas in the quota table are evaluated given the computed sizes, and,
>>>>>>>> for those tables/namespaces violating the quota, RegionServers are
>>>>>>>> informed to take some action to limit any further filesystem growth by
>>>>>>>> that table/namespace.
>>>>>>>>
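
The roll-up and evaluation step in the tl;dr above can be sketched roughly
as follows. This is only an illustration under stated assumptions, not the
design doc's implementation: the function names, the dictionary shapes, and
the choice of "strictly greater than the limit" as the violation test are
all hypothetical.

```python
from collections import defaultdict


def roll_up(region_sizes):
    """Sum per-region sizes (bytes) into per-table and per-namespace totals.

    region_sizes maps (namespace, table, region_name) -> size in bytes,
    standing in for the reports RegionServers send to the Master.
    """
    table_sizes = defaultdict(int)
    namespace_sizes = defaultdict(int)
    for (namespace, table, _region), size in region_sizes.items():
        table_sizes[(namespace, table)] += size
        namespace_sizes[namespace] += size
    return dict(table_sizes), dict(namespace_sizes)


def find_violations(region_sizes, table_quotas, namespace_quotas):
    """Return the tables/namespaces whose rolled-up size exceeds their quota.

    Assumption: a quota is violated when usage is strictly above the limit.
    """
    table_sizes, namespace_sizes = roll_up(region_sizes)
    violations = []
    for key, limit in table_quotas.items():
        if table_sizes.get(key, 0) > limit:
            violations.append(("table", key))
    for namespace, limit in namespace_quotas.items():
        if namespace_sizes.get(namespace, 0) > limit:
            violations.append(("namespace", namespace))
    return violations
```

The returned list is what the Master would use to decide which
RegionServers to inform. Since the sizes are reported periodically, usage
can pass the limit before a violation is noticed, which is the "soft"
part of the approach.
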
>>>>>>>> I'd encourage you to give the document a read -- I tried to cover as
>>>>>>>> much as I could without getting unnecessarily bogged down in
>>>>>>>> implementation details.
>>>>>>>>
>>>>>>>> Feedback is, of course, welcomed. I'd like to start sketching out a
>>>>>>>> breakdown of the work (all writing and no programming makes Josh a sad
>>>>>>>> boy). I'm happy to field any/all questions. Thanks in advance.
>>>>>>>>
>>>>>>>> - Josh
>>>>>>>>
>>>>>>>> [1] http://home.apache.org/~elserj/hbase/FileSystemQuotasforApacheHBase.pdf
>>>>>>>>
>>>>>>>>
>>>
>>>
>>> --
>>> Best regards,
>>>
>>>     - Andy
>>>
>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>> (via Tom White)
>>>
>