Date: Tue, 1 Dec 2015 09:01:52 -0800
Subject: Re: Disk usage drops after RegionServer restart? (0.98)
From: Vladimir Rodionov
To: "user@hbase.apache.org"

I think this is because some files with open handles get deleted. The space
can be reclaimed only once those handles are closed, for example when the
process exits. This is a known "feature" of Linux.

-Vlad

On Mon, Nov 30, 2015 at 10:56 PM, Stack wrote:

> Thanks for writing back Otis. What was your CP doing?
> St.Ack
>
> On Sat, Nov 28, 2015 at 7:08 PM, Otis Gospodnetić <
> otis.gospodnetic@gmail.com> wrote:
>
> > Hi,
> >
> > In our case it turned out to be co-processors. More specifically, thanks
> > to Logsene we found that one of our co-processors logged some exceptions
> > on start. Once we fixed those errors we stopped having issues with
> > growing disk usage. Sorry I don't have more details, but maybe this
> > helps somebody.
> >
> > Otis
> > --
> > Monitoring - Log Management - Alerting - Anomaly Detection
> > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >
> > On Thu, Oct 29, 2015 at 1:52 PM, Stack wrote:
> >
> > > Are you printing out filesize? (I don't see the -s arg on lsof.)
> > > St.Ack
> > >
> > > On Fri, Oct 23, 2015 at 8:08 PM, Otis Gospodnetić <
> > > otis.gospodnetic@gmail.com> wrote:
> > >
> > > > Hi Ted,
> > > >
> > > > 0.98.6-cdh5.3.0
> > > >
> > > > I did actually try to use lsof, but I didn't see anything unusual there.
> > > > Is there something specific I should look for? Things owned by hbase
> > > > user or hdfs or yarn? Hm, here, I don't really see anything interesting
> > > >
> > > > $ sudo lsof| grep '/mnt' <== this is where all data lives and where
> > > > disk usage drops after RS restart
> > > >
> > > > java  2654 hdfs        1w REG 202,16     89487 44042562 /mnt/hadoop-hdfs/log/hadoop-hdfs-datanode-spm-hbase-slave11.prod.sematext.out
> > > > java  2654 hdfs        2w REG 202,16     89487 44042562 /mnt/hadoop-hdfs/log/hadoop-hdfs-datanode-spm-hbase-slave11.prod.sematext.out
> > > > java  2654 hdfs      286w REG 202,16 108938205 44044137 /mnt/hadoop-hdfs/log/hadoop-hdfs-datanode-spm-hbase-slave11.prod.sematext.log
> > > > java  2654 hdfs      289w REG 202,16         0 44040203 /mnt/hadoop-hdfs/log/SecurityAuth-hdfs.audit
> > > > java  2654 hdfs      314w REG 202,16    261462 44040213 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/dncp_block_verification.log.curr
> > > > java  2654 hdfs      316r REG 202,16 134217728 44045060 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir74/subdir58/blk_1078606358
> > > > java  2654 hdfs      318r REG 202,16 134217728 44057015 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir74/subdir224/blk_1078648930
> > > > java  2654 hdfs     319uW REG 202,16        36 44042741 /mnt/hadoop-hdfs/data/in_use.lock
> > > > java  2654 hdfs      321r REG 202,16   1048583 44042793 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir7/blk_1078658889_4918820.meta
> > > > java  2654 hdfs      330u REG 202,16    352563 44048279 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675432_4935363.meta
> > > > java  2654 hdfs      333r REG 202,16 134217728 44055769 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir9/blk_1078659381
> > > > java  2654 hdfs      335u REG 202,16  45127168 44048273 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675432
> > > > java  2654 hdfs      340r REG 202,16 134217728 44042791 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir7/blk_1078658889
> > > > java  2654 hdfs      343r REG 202,16  13882119 44048053 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir71/blk_1078675385
> > > > java  2654 hdfs      345u REG 202,16    485059 44048209 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675399_4935330.meta
> > > > java  2654 hdfs      346r REG 202,16 134217728 44053723 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir4/blk_1078658098
> > > > java  2654 hdfs      347u REG 202,16    371455 44047931 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675364_4935295.meta
> > > > java  2654 hdfs      348u REG 202,16  47545282 44047927 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675364
> > > > java  2654 hdfs      354u REG 202,16  20386405 44047875 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir8/blk_1078659266
> > > > java  2654 hdfs      355r REG 202,16 134217728 44042762 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir74/subdir243/blk_1078653797
> > > > java  2654 hdfs      357r REG 202,16 134217728 44042535 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir66/blk_1078674123
> > > > java  2654 hdfs      359u REG 202,16      1839 44045445 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078674506_4934437.meta
> > > > java  2654 hdfs      360u REG 202,16    234130 44045440 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078674506
> > > > java  2654 hdfs      363r REG 202,16  20629437 44046774 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir17/blk_1078661533
> > > > java  2654 hdfs      369r REG 202,16  18304945 44047599 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir71/blk_1078675270
> > > > java  2654 hdfs      370r REG 202,16  62086413 44048199 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/rbw/blk_1078675399
> > > > java  2654 hdfs      379r REG 202,16 134217728 44050035 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir3/blk_1078657983
> > > > java  2654 hdfs      390u REG 202,16  20857780 44050270 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir8/blk_1078659267
> > > > java  2654 hdfs      408r REG 202,16 115453375 44042299 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir66/blk_1078674120
> > > > java  2654 hdfs      415r REG 202,16  20253192 44053520 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir60/blk_1078672624
> > > > java  2654 hdfs      423r REG 202,16  18382878 44047547 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir71/blk_1078675257
> > > > java  2654 hdfs      424r REG 202,16  19555559 44040692 /mnt/hadoop-hdfs/data/current/BP-282774069-10.123.212.150-1419335230604/current/finalized/subdir75/subdir65/blk_1078673801
> > > > bash 15005 ec2-user   cwd DIR 202,16      4096        2 /mnt
> > > > sudo 16055 root       cwd DIR 202,16      4096        2 /mnt
> > > > grep 16056 ec2-user   cwd DIR 202,16      4096        2 /mnt
> > > > sed  16057 ec2-user   cwd DIR 202,16      4096        2 /mnt
> > > > lsof 16058 root       cwd DIR 202,16      4096        2 /mnt
> > > > lsof 16059 root       cwd DIR 202,16      4096        2 /mnt
> > > > bash 18748 hbase       1w REG 202,16     12843  4980744 /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.out
> > > > bash 18748 hbase       2w REG 202,16     12843  4980744 /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.out
> > > > java 18761 hbase       1w REG 202,16     12843  4980744 /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.out
> > > > java 18761 hbase       2w REG 202,16     12843  4980744 /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.out
> > > > java 18761 hbase     338w REG 202,16 117537786  4980753 /mnt/hbase/log/hbase-hbase-regionserver-spm-hbase-slave11.prod.sematext.log
> > > > java 18761 hbase     339w REG 202,16         0  4980741 /mnt/hbase/log/SecurityAuth.audit
> > > > java 29057 yarn        1w REG 202,16    130105 51380228 /mnt/hadoop-yarn/log/yarn-yarn-nodemanager-spm-hbase-slave11.prod.sematext.out
> > > > java 29057 yarn        2w REG 202,16    130105 51380228 /mnt/hadoop-yarn/log/yarn-yarn-nodemanager-spm-hbase-slave11.prod.sematext.out
> > > > java 29057 yarn      286w REG 202,16 103611255 51380852 /mnt/hadoop-yarn/log/yarn-yarn-nodemanager-spm-hbase-slave11.prod.sematext.log
> > > >
> > > > I don't see anything big there...
> > > >
> > > > Thanks,
> > > > Otis
> > > > --
> > > > Monitoring - Log Management - Alerting - Anomaly Detection
> > > > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> > > >
> > > > On Fri, Oct 23, 2015 at 10:26 PM, Ted Yu wrote:
> > > >
> > > > > Which specific release of 0.98 are you using?
> > > > >
> > > > > Have you used lsof to see which files were being held onto?
> > > > >
> > > > > Thanks
> > > > >
> > > > > On Fri, Oct 23, 2015 at 7:21 PM, Otis Gospodnetić <
> > > > > otis.gospodnetic@gmail.com> wrote:
> > > > >
> > > > > > Hello,
> > > > > >
> > > > > > Is/was there a known issue with HBase 0.98 "holding onto" files?
> > > > > >
> > > > > > We noticed the used disk space metric going up, up and up and we
> > > > > > could not stop it with major compaction.
> > > > > > But we noticed that if we restart a RegionServer 2 things happen:
> > > > > > 1) its disk usage immediately drops a lot
> > > > > > 2) the disk usage of other RegionServers drops some as well
> > > > > >
> > > > > > Have a look at this chart:
> > > > > > https://apps.sematext.com/spm-reports/s/Ssy4ViFGHq
> > > > > >
> > > > > > At 1:54 we restarted the first RS (blue line)
> > > > > > At 2:03 we restarted the second RS (dark green line)
> > > > > >
> > > > > > Is/was this a known HBase 0.98 issue?
> > > > > >
> > > > > > Thanks,
> > > > > > Otis
> > > > > > --
> > > > > > Monitoring - Log Management - Alerting - Anomaly Detection
> > > > > > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
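
The explanation given at the top of this thread (files deleted while a process still holds them open keep their disk blocks) can be illustrated with a small Python sketch. It assumes Linux with a readable /proc; the function names `deleted_open_files` and `demo` are hypothetical helpers for illustration and are not part of HBase, Hadoop, or lsof.

```python
import os
import tempfile


def deleted_open_files(pid="self"):
    """Return (fd, target, size_bytes) for deleted files still held open by pid.

    Linux-only assumption: /proc/<pid>/fd symlinks of unlinked files end
    in " (deleted)". Reading another user's process usually needs root.
    """
    fd_dir = f"/proc/{pid}/fd"
    found = []
    for name in os.listdir(fd_dir):
        link = os.path.join(fd_dir, name)
        try:
            target = os.readlink(link)
            if target.endswith(" (deleted)"):
                # stat() follows the link to the still-allocated inode
                size = os.stat(link).st_size
                found.append((int(name), target, size))
        except OSError:
            continue  # fd was closed while scanning, or permission denied
    return found


def demo():
    """Unlink a file while it is open and report what the kernel still holds."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"x" * 4096)
        os.unlink(path)       # the directory entry is gone ...
        info = os.fstat(fd)   # ... but the inode and its data are still there
        return os.path.exists(path), info.st_nlink, info.st_size
    finally:
        os.close(fd)          # only after this can the blocks be freed


if __name__ == "__main__":
    print(demo())
    print(deleted_open_files())
```

Calling `deleted_open_files(18761)` as root would inspect the RegionServer process from the lsof listing quoted above (18761 is the PID shown there; substitute the current one). On the command line, `lsof +L1` lists open files whose link count is below one, which is the same condition.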