hadoop-hdfs-issues mailing list archives

From "Wellington Chevreuil (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12618) fsck -includeSnapshots reports wrong amount of total blocks
Date Mon, 23 Oct 2017 10:50:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214955#comment-16214955
] 

Wellington Chevreuil commented on HDFS-12618:
---------------------------------------------

Hi [~xiaochen], thanks for reviewing and sharing your thoughts. Yeah, for the Web UI it is just
a matter of showing the total number of blocks currently mapped in the BlockManager. Regarding
the issues you highlighted previously, I have some comments below:
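For reference, here is roughly where that Web UI figure would come from, as a minimal sketch;
the accessor names are from memory (assumptions), not the exact UI code path:

{noformat}
import org.apache.hadoop.hdfs.server.namenode.NameNode;

// Sketch only: the Web UI block count is essentially the number of blocks
// currently tracked by the BlockManager (accessor names assumed).
final class WebUiBlockCount {
  static long totalMappedBlocks(NameNode namenode) {
    return namenode.getNamesystem().getBlockManager().getTotalBlocks();
  }
}
{noformat}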

bq. snapshotSeenBlocks could be huge, if the snapshot dir is at the top level, and has a lot
of blocks under it. In extreme cases, this may put pressure on NN. At the minimum we should
make sure the extra space is allocated only when -includeSnapshots is set.
That's indeed something that concerned me while thinking about this. To minimize heap pressure
effects, this only adds the 1st block of each snapshot file already checked. It also adds
only blocks from files that exist only on snapshots (files that had already been deleted).
So, although the *ArrayList* is created in a generic scope, it will only really be populated
with blocks when "-includeSnapshots" is passed. Unfortunately, because *checkDir*
is invoked recursively, I needed to initialise the *ArrayList* at the first call to *checkDir*,
from the *check* method.
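
To illustrate the idea, here is a small sketch; the class and method names below are hypothetical
stand-ins, not the actual NamenodeFsck code. The list is allocated only when -includeSnapshots is
requested, and only the first block's name of each snapshot-only file is remembered:

{noformat}
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in that only illustrates the allocation and
// first-block tracking described above.
class SnapshotBlockTracker {
  // Allocated only when -includeSnapshots is requested, so regular fsck
  // runs pay no extra heap cost.
  private final List<String> seenBlocks;

  SnapshotBlockTracker(boolean includeSnapshots) {
    this.seenBlocks = includeSnapshots ? new ArrayList<>() : null;
  }

  // Returns true the first time a snapshot-only file is seen; only the name
  // of its first block is stored, keeping the list small.
  boolean markIfUnseen(String firstBlockName) {
    if (seenBlocks == null || seenBlocks.contains(firstBlockName)) {
      return false;
    }
    seenBlocks.add(firstBlockName);
    return true;
  }
}
{noformat}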

bq. where inodes are involved, file system locks are required. This adds burden to the NN
which should be minimized.
I guess this is related to the block of code below, where I'm not acquiring any namesystem
lock (apart from the one taken in *getBlockLocations*), which could lead to an inconsistent
report of blocks:
{noformat}
      FSDirectory dir = namenode.getNamesystem().getFSDirectory();
      final INodesInPath iip = dir.getINodesInPath(filePath,
          FSDirectory.DirOp.READ);
      final INodeFile inode = INodeFile.valueOf(iip.getLastINode(), filePath);
      LocatedBlocks blocks = getBlockLocations(filePath, snapshotFileInfo);
      printFileStats(filePath, blocks, snapshotFileInfo);
      if(inode.isWithSnapshot()){
        if(!seenBlocks.contains(inode.getBlocks()[0].getBlockName())) {
          replRes.totalBlocks += inode.getBlocks().length;
          seenBlocks.add(inode.getBlocks()[0].getBlockName());
        }
      }
{noformat}

I guess that should be rewritten as follows:

{noformat}
      FSDirectory dir = namenode.getNamesystem().getFSDirectory();
      final INodesInPath iip = dir.getINodesInPath(filePath,
          FSDirectory.DirOp.READ);
      namenode.getNamesystem().readLock();
      try {
        final INodeFile inode = INodeFile.valueOf(iip.getLastINode(), filePath);
        if (inode.isWithSnapshot()) {
          if (!seenBlocks.contains(inode.getBlocks()[0].getBlockName())) {
            replRes.totalBlocks += inode.getBlocks().length;
            seenBlocks.add(inode.getBlocks()[0].getBlockName());
          }
        }
      } finally {
        namenode.getNamesystem().readUnlock();
      }
      LocatedBlocks blocks = getBlockLocations(filePath, snapshotFileInfo);
      printFileStats(filePath, blocks, snapshotFileInfo);
{noformat}
Agreed that this can cause concurrency issues. Still, this code would only be executed for
snapshot dirs. I can't see any other way to get the block info details required to avoid
over-counting these blocks.

bq. Catching a RunTimeException and re-throw looks like a bad practice. What are the RTEs that
could be thrown that we need to wrap?
This was a problem I ran into while using Java lambda functions. The problem is that I had
to declare a *java.util.function.Consumer* parameter in the *checkDir* signature, in order to
pass both *checkFilesInSnapshotOnly* and *check* calls as lambda functions, but
*java.util.function.Consumer.accept* does not declare any checked exception. That means any
*IOException* thrown by *checkFilesInSnapshotOnly* or *check* when executed from a lambda
context cannot propagate as a checked exception. The only way I found to surface those
*IOException* occurrences to the outer method was to wrap them in an RTE, catch it in the
outer method, and recover the original *IOException* by unwrapping it. I still think using
lambdas requires less refactoring while avoiding code duplication.
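
For clarity, a minimal sketch of that wrap/unwrap pattern; it uses *java.io.UncheckedIOException*
as the wrapper and hypothetical method names, rather than the actual patch code (which wraps in a
plain *RuntimeException* around *checkDir*/*check*/*checkFilesInSnapshotOnly*):

{noformat}
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

class LambdaIoExample {

  @FunctionalInterface
  interface IoCheck {
    void apply(String path) throws IOException;
  }

  // Consumer.accept cannot throw a checked IOException, so the lambda body
  // wraps it in an unchecked exception...
  static Consumer<String> asConsumer(IoCheck check) {
    return path -> {
      try {
        check.apply(path);
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    };
  }

  // ...and the outer method unwraps it to recover the original IOException.
  static void walk(String path, Consumer<String> visitor) throws IOException {
    try {
      visitor.accept(path);
    } catch (UncheckedIOException e) {
      throw e.getCause();
    }
  }
}
{noformat}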

Please share any other ideas/concerns you may have.

> fsck -includeSnapshots reports wrong amount of total blocks
> -----------------------------------------------------------
>
>                 Key: HDFS-12618
>                 URL: https://issues.apache.org/jira/browse/HDFS-12618
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 3.0.0-alpha3
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Minor
>         Attachments: HDFS-121618.initial, HDFS-12618.001.patch
>
>
> When snapshot is enabled, if a file is deleted but is contained by a snapshot, *fsck*
> will not report blocks for such a file, showing a different number of *total blocks* than
> what is exposed in the Web UI.
> This should be fine, as *fsck* provides the *-includeSnapshots* option. The problem is that
> the *-includeSnapshots* option causes *fsck* to count blocks for every occurrence of a file
> on snapshots, which is wrong because these blocks should be counted only once (for instance,
> if a 100MB file is present on 3 snapshots, it would still map to only one block in HDFS).
> This causes fsck to report many more blocks than actually exist in HDFS and than the Web UI
> reports.
> Here's an example:
> 1) HDFS has two files of 2 blocks each:
> {noformat}
> $ hdfs dfs -ls -R /
> drwxr-xr-x   - root supergroup          0 2017-10-07 21:21 /snap-test
> -rw-r--r--   1 root supergroup  209715200 2017-10-07 20:16 /snap-test/file1
> -rw-r--r--   1 root supergroup  209715200 2017-10-07 20:17 /snap-test/file2
> drwxr-xr-x   - root supergroup          0 2017-05-13 13:03 /test
> {noformat} 
> 2) There are two snapshots, with the two files present on each of the snapshots:
> {noformat}
> $ hdfs dfs -ls -R /snap-test/.snapshot
> drwxr-xr-x   - root supergroup          0 2017-10-07 21:21 /snap-test/.snapshot/snap1
> -rw-r--r--   1 root supergroup  209715200 2017-10-07 20:16 /snap-test/.snapshot/snap1/file1
> -rw-r--r--   1 root supergroup  209715200 2017-10-07 20:17 /snap-test/.snapshot/snap1/file2
> drwxr-xr-x   - root supergroup          0 2017-10-07 21:21 /snap-test/.snapshot/snap2
> -rw-r--r--   1 root supergroup  209715200 2017-10-07 20:16 /snap-test/.snapshot/snap2/file1
> -rw-r--r--   1 root supergroup  209715200 2017-10-07 20:17 /snap-test/.snapshot/snap2/file2
> {noformat}
> 3) *fsck -includeSnapshots* reports 12 blocks in total (4 blocks for the normal file
> path, plus 4 blocks for each snapshot path):
> {noformat}
> $ hdfs fsck / -includeSnapshots
> FSCK started by root (auth:SIMPLE) from /127.0.0.1 for path / at Mon Oct 09 15:15:36 BST 2017
> Status: HEALTHY
>  Number of data-nodes:	1
>  Number of racks:		1
>  Total dirs:			6
>  Total symlinks:		0
> Replicated Blocks:
>  Total size:	1258291200 B
>  Total files:	6
>  Total blocks (validated):	12 (avg. block size 104857600 B)
>  Minimally replicated blocks:	12 (100.0 %)
>  Over-replicated blocks:	0 (0.0 %)
>  Under-replicated blocks:	0 (0.0 %)
>  Mis-replicated blocks:		0 (0.0 %)
>  Default replication factor:	1
>  Average block replication:	1.0
>  Missing blocks:		0
>  Corrupt blocks:		0
>  Missing replicas:		0 (0.0 %)
> {noformat}
> 4) Web UI shows the correct number (4 blocks only):
> {noformat}
> Security is off.
> Safemode is off.
> 5 files and directories, 4 blocks = 9 total filesystem object(s).
> {noformat}
> I would like to work on this, and will propose an initial solution shortly.



