Date: Mon, 25 Sep 2017 12:13:00 +0000 (UTC)
From: "Vishal Khandelwal (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-18872) Backup scaling for multiple tables and millions of rows

     [ https://issues.apache.org/jira/browse/HBASE-18872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vishal Khandelwal updated HBASE-18872:
--------------------------------------

Description:
I ran a simple experiment: loading ~200 million rows into table 1 and nothing into table 2. The test was done on a local cluster where approximately 3-4 containers were running in parallel. The focus of the test was not on how much time the backup takes overall, but on the time spent on the table where no data had changed.

*Table without Data -->*
Elapsed: 44mins, 52sec
Average Map Time: 3sec
Average Shuffle Time: 2mins, 35sec
Average Merge Time: 0sec
Average Reduce Time: 0sec
Map: 2052
Reduce: 1

*Table with Data -->*
Elapsed: 1hrs, 44mins, 10sec
Average Map Time: 4sec
Average Shuffle Time: 37sec
Average Merge Time: 3sec
Average Reduce Time: 47sec
Map: 2052
Reduce: 64

All of the above numbers are from a single-node cluster, so not many mappers run in parallel. Let's extrapolate to a 20-node cluster with ~100 tables and roughly 2000 WALs' worth of data to back up. If each of the 20 nodes can run 3 containers, 60 WALs are processed in parallel, and assuming ~3 sec is spent on each WAL, that is 2000 WALs * 3 sec = 6000 sec of work, divided by 60 in parallel = ~100 sec per table, times 100 tables = ~10000 sec for all tables, i.e. ~166 minutes (~2.7 hours) spent only on filtering. This does not seem to scale (these are just rough numbers from a basic test), since all the parsing is O(m WALs * n tables).
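To make the back-of-the-envelope arithmetic concrete, here is a tiny sketch of the cost model; every constant in it is an assumption taken from the rough numbers above, not a measured value:

{code:java}
// Back-of-the-envelope cost model for the current per-table WAL filtering.
// All constants are assumptions from the rough estimate above.
public class BackupFilterCostEstimate {
  public static void main(String[] args) {
    int wals = 2000;            // WAL files each table's backup job must scan
    int tables = 100;           // tables being backed up
    double secPerWal = 3.0;     // assumed parse time per WAL
    int parallelism = 20 * 3;   // 20 nodes x 3 containers = 60 WALs in parallel

    // Every table's job re-parses every WAL: O(wals * tables) parses in total.
    double secPerTable = wals * secPerWal / parallelism;  // ~100 sec wall-clock
    double totalSec = secPerTable * tables;               // ~10,000 sec
    System.out.printf("per table ~%.0f sec, all tables ~%.0f sec (~%.0f min, ~%.1f hrs)%n",
        secPerTable, totalSec, totalSec / 60, totalSec / 3600);
  }
}
{code}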
The main intent of this test is to show that even the backup of a table with very little churn can take a significant amount of time just for filtering the data. As the number of tables or the data size grows, this does not look scalable. Even on our current cluster I can easily see numbers close to 100 tables, 200 million rows, 200-300 GB.

I would suggest that the filtering parse the WALs only once and segregate the edits into per-table WALs --> HFiles from the per-table WALs (just a rough idea), as sketched below.
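A minimal sketch of that single-pass idea follows, under the assumption that we can read WAL entries and route them by table name; WalEntry, WalReader, WalWriter and WalIo are hypothetical placeholder interfaces for illustration, not the actual HBase WAL API:

{code:java}
import java.io.Closeable;
import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the single-pass filtering suggested above: each WAL is
// read once and its edits are fanned out into per-table outputs, so the parse
// cost is O(m WALs) instead of O(m WALs * n tables). WalEntry, WalReader,
// WalWriter and WalIo are placeholder interfaces, not the real HBase WAL API.
public class SinglePassWalSplitter {

  interface WalEntry { String tableName(); }

  interface WalReader extends Closeable { WalEntry next() throws IOException; }

  interface WalWriter extends Closeable { void append(WalEntry e) throws IOException; }

  interface WalIo { // placeholder factory for opening WALs and per-table outputs
    WalReader openReader(String walPath) throws IOException;
    WalWriter createWriter(String tableName) throws IOException;
  }

  void splitOnce(List<String> walPaths, Set<String> tablesInBackup, WalIo io) throws IOException {
    Map<String, WalWriter> perTable = new HashMap<>();
    try {
      for (String wal : walPaths) {
        try (WalReader reader = io.openReader(wal)) {
          for (WalEntry e = reader.next(); e != null; e = reader.next()) {
            String table = e.tableName();
            if (!tablesInBackup.contains(table)) {
              continue; // edits for tables outside the backup set are skipped
            }
            WalWriter out = perTable.get(table);
            if (out == null) {
              out = io.createWriter(table);
              perTable.put(table, out);
            }
            out.append(e); // each WAL is parsed once; edits land in per-table outputs
          }
        }
      }
    } finally {
      for (WalWriter out : perTable.values()) {
        out.close();
      }
    }
    // The per-table outputs can then be converted to HFiles and bulk-loaded,
    // as suggested in the description.
  }
}
{code}

With this shape the number of WAL parses no longer multiplies with the number of tables; the remaining per-table cost is just writing out each table's own edits.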
> Backup scaling for multiple tables and millions of rows
> --------------------------------------------------------
>
>                 Key: HBASE-18872
>                 URL: https://issues.apache.org/jira/browse/HBASE-18872
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Vishal Khandelwal
>

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)