From: GitBox
To: common-issues@hadoop.apache.org
Subject: [GitHub] [hadoop] arp7 commented on a change in pull request #832: HDDS-1535. Space tracking for Open Containers : Handle Node Startup. Contributed by Supratim Deka
Message-ID: <155846592020.6768.10240141287036314895.gitbox@gitbox.apache.org>
Date: Tue, 21 May 2019 19:12:00 -0000

arp7 commented on a change in pull request #832: HDDS-1535. Space tracking for Open Containers : Handle Node Startup. Contributed by Supratim Deka
URL: https://github.com/apache/hadoop/pull/832#discussion_r286181220

##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/ContainerReader.java
##########

@@ -215,4 +224,27 @@ public void verifyContainerData(ContainerData containerData)
           ContainerProtos.Result.UNKNOWN_CONTAINER_TYPE);
     }
   }
+
+  private void initializeUsedBytes(KeyValueContainer container)
+      throws IOException {
+    KeyValueBlockIterator blockIter = new KeyValueBlockIterator(
+        container.getContainerData().getContainerID(),
+        new File(container.getContainerData().getContainerPath()));
+    long usedBytes = 0;
+
+    while (blockIter.hasNext()) {
+      BlockData block = blockIter.nextBlock();
+      long blockLen = 0;
+
+      List<ContainerProtos.ChunkInfo> chunkInfoList = block.getChunks();

Review comment:
   Hi @supratimdeka, it looks like we are initializing the counts from the metadata in RocksDB. I wonder whether it would be better to initialize this value from the bytes on disk, i.e. by counting the usage of all the chunk files. The two should usually match; however, if there are chunk files that were written but never committed successfully (a failed commit), there will be a discrepancy. I don't think we actively delete such unreferenced chunk files today, so after enough failed writes a container could grow far beyond its maximum capacity. This is a failure the DataNode itself would never observe, so short of deleting the chunk files via the scanner there is not much we can do.
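   For illustration, the disk-based initialization suggested above might look roughly like the sketch below. This is a sketch under assumptions, not Ozone's actual API: the flat chunks-directory layout and the usedBytesFromDisk helper are hypothetical.

       import java.io.File;
       import java.io.IOException;

       // Sketch only: derive a container's used-bytes count from the chunk
       // files on disk instead of RocksDB metadata. The flat chunks-directory
       // layout assumed here is hypothetical.
       final class DiskUsageSketch {

         static long usedBytesFromDisk(File chunksDir) throws IOException {
           File[] chunkFiles = chunksDir.listFiles();
           if (chunkFiles == null) {
             throw new IOException("Cannot list chunk files under " + chunksDir);
           }
           long usedBytes = 0;
           for (File chunkFile : chunkFiles) {
             if (chunkFile.isFile()) {
               // length() counts every chunk file present on disk, including
               // ones written but never committed -- the discrepancy noted
               // in the comment above.
               usedBytes += chunkFile.length();
             }
           }
           return usedBytes;
         }
       }

   The tradeoff would be a full directory scan per open container at startup, in exchange for a count that also covers uncommitted chunk files left behind by failed commits.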