Date: Mon, 6 Mar 2017 05:32:33 +0000 (UTC)
From: "Xiang Li (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-14090) Redo FS layout; let go of tables/regions/stores directory hierarchy in DFS

    [ https://issues.apache.org/jira/browse/HBASE-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15896763#comment-15896763 ]

Xiang Li commented on HBASE-14090:
----------------------------------

Thanks Umesh, got your idea.

> Redo FS layout; let go of tables/regions/stores directory hierarchy in DFS
> --------------------------------------------------------------------------
>
>                 Key: HBASE-14090
>                 URL: https://issues.apache.org/jira/browse/HBASE-14090
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: stack
>            Assignee: Sean Busbey
>
> Our layout as is won't work if 1M regions; e.g. HDFS will fall over if directories of hundreds of thousands of files. HBASE-13991 (Humongous Tables) would address this specific directory problem only by adding subdirs under table dir but there are other issues with our current layout:
> * Our table/regions/column family 'facade' has to be maintained in two locations -- in master memory and in the hdfs directory layout -- and the farce needs to be kept synced or worse, the model management is split between master memory and DFS layout. 'Syncing' in HDFS has us dropping constructs such as 'Reference' and 'HalfHFiles' on split, 'HFileLinks' when archiving, and so on. This 'tie' makes it hard to make changes.
> * While HDFS has atomic rename, useful for fencing and for having files added atomically, if the model were solely owned by hbase, there are hbase primitives we could make use of -- changes in a row are atomic and coprocessors -- to simplify table transactions and provide more consistent views of our model to clients; file 'moves' could be a memory operation only rather than an HDFS call; sharing files between tables/snapshots and when it is safe to remove them would be simplified if one owner only; and so on.
> This is an umbrella blue-sky issue to discuss what a new layout would look like and how we might get there. I'll follow up with some sketches of what new layout could look like that come of some chats a few of us have been having. We are also under the 'delusion' that move to a new layout could be done as part of a rolling upgrade and that the amount of work involved is not gargantuan.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)