From: "Hudson (JIRA)"
To: hdfs-issues@hadoop.apache.org
Date: Tue, 24 Apr 2018 20:56:10 +0000 (UTC)
Subject: [jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

    [ https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451011#comment-16451011 ]

Hudson commented on HDFS-12506:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14057 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14057/])
HDFS-12506. Ozone: ListBucket is too slow. Contributed by Weiwei Yang.
(wwei: rev e01245495f71a20a5478c29c32d849d4b2720c57)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/ksm/KSMMetadataManagerImpl.java
* (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/ozone/OzoneConsts.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/TestMetadataStore.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/utils/MetadataStore.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/utils/RocksDBStore.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/web/client/TestBuckets.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/scm/cli/SQLCLI.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/ksm/TestBucketManagerImpl.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/utils/LevelDBStore.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/web/client/TestVolume.java


> Ozone: ListBucket is too slow
> -----------------------------
>
>                 Key: HDFS-12506
>                 URL: https://issues.apache.org/jira/browse/HDFS-12506
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Blocker
>              Labels: ozoneMerge, performance
>             Fix For: HDFS-7240
>
>         Attachments: HDFS-12506-HDFS-7240.001.patch, HDFS-12506-HDFS-7240.002.patch, HDFS-12506-HDFS-7240.003.patch, HDFS-12506-HDFS-7240.004.patch, HDFS-12506-HDFS-7240.005.patch, HDFS-12506-HDFS-7240.006.patch, HDFS-12506-HDFS-7240.007.patch
>
>
> Generated 3 million keys in ozone, and run {{listBucket}} command to get a list of buckets under a volume,
> {code}
> bin/hdfs oz -listBucket http://15oz1.fyre.ibm.com:9864/vol-0-15143 -user wwei
> {code}
> this call spent over *15 seconds* to finish. The problem was caused by the inflexible structure of KSM DB.
> Right now {{ksm.db}} stores keys like the following:
> {code}
> /v1/b1
> /v1/b1/k1
> /v1/b1/k2
> /v1/b1/k3
> /v1/b2
> /v1/b2/k1
> /v1/b2/k2
> /v1/b2/k3
> /v1/b3
> /v1/b4
> {code}
> Keys are sorted in natural order, so to list the buckets under a volume, e.g. /v1, we need to seek to the /v1 point, then iterate and filter keys. This ends up scanning all keys under volume /v1. The problem with this design is that we don't have an efficient way to locate all buckets without scanning the keys.
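
To illustrate the scan described in the issue, here is a minimal sketch in Python. It is not the KSM code: a sorted in-memory list stands in for the sorted key space of ksm.db, and the names KEYS and list_buckets are hypothetical. It assumes bucket keys have the form /volume/bucket and object keys /volume/bucket/key, as in the listing above.

```python
from bisect import bisect_left

# Hypothetical stand-in for the sorted key space of ksm.db,
# using the example keys from the issue description.
KEYS = sorted([
    "/v1/b1", "/v1/b1/k1", "/v1/b1/k2", "/v1/b1/k3",
    "/v1/b2", "/v1/b2/k1", "/v1/b2/k2", "/v1/b2/k3",
    "/v1/b3", "/v1/b4",
])


def list_buckets(sorted_keys, volume):
    """Seek to the volume prefix, then iterate and filter.

    Returns (bucket keys, number of keys visited) so the cost of
    the scan is visible.
    """
    prefix = volume + "/"
    start = bisect_left(sorted_keys, prefix)  # the "seek" step
    buckets = []
    scanned = 0
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # left the volume's key range
        scanned += 1
        # A bucket key has no further "/" after the volume prefix;
        # anything deeper is an object key and is filtered out.
        if "/" not in key[len(prefix):]:
            buckets.append(key)
    return buckets, scanned


buckets, scanned = list_buckets(KEYS, "/v1")
print(buckets, scanned)
```

In this sketch every key under /v1 is visited even though only four of them are buckets, so the work grows with the number of object keys in the volume rather than with the number of buckets.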