Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hdfs-issues@hadoop.apache.org
Date: Thu, 19 Dec 2013 22:07:08 +0000 (UTC)
From: "Andrew Wang (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Message-ID: <JIRA.12684170.1386830984455.67647.1387490828066@arcas>
In-Reply-To: <JIRA.12684170.1386830984455@arcas>
References: <JIRA.12684170.1386830984455@arcas>
Subject: [jira] [Updated] (HDFS-5659) dfsadmin -report doesn't output cache
 information properly
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


     [ https://issues.apache.org/jira/browse/HDFS-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang updated HDFS-5659:
------------------------------

    Attachment: hdfs-5659-1.patch

Thanks for the report, patch attached. Two notes:

* I refactored the PBHelper methods for DatanodeInfo in what I think is a compatible fashion. Ideally there'd only be one convert method, but I didn't want to tease out the behavior there.
* The test block size (512) was less than the page size (4096), so it was getting automatically rounded up to the page size on the DN, leading to unexpected numbers. The same issue crops up on the namenode when it comes to quotas and stats: we won't hit our perceived capacity if we're caching a bunch of files whose length is one byte past a page boundary (n*PAGE_SIZE+1), because each one wastes almost a full page. I don't think this is a big deal (we're looking at worst case about 4k of waste per cached file), but it's worth keeping in mind.
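
To make the rounding concrete, here is a minimal sketch of the arithmetic. It is not the DataNode code from the patch; the class and method names are hypothetical, and the 4096-byte page size is taken from the note above. For what it's worth, the CacheUsed value in the report below (360448) is exactly 88 * 4096, which is consistent with page-granular accounting.

```java
// Hypothetical illustration only; not the actual HDFS DataNode code.
final class PageRounding {
    // Page size assumed from the note above (4096 bytes).
    static final long PAGE_SIZE = 4096;

    /** Rounds a byte count up to the next multiple of pageSize. */
    static long roundUp(long bytes, long pageSize) {
        return ((bytes + pageSize - 1) / pageSize) * pageSize;
    }

    /** Bytes charged against cache capacity that hold no data. */
    static long waste(long bytes, long pageSize) {
        return roundUp(bytes, pageSize) - bytes;
    }

    public static void main(String[] args) {
        // A 512-byte test block is charged a full page.
        System.out.println(roundUp(512, PAGE_SIZE));
        // One byte past a page boundary wastes almost a whole page.
        System.out.println(waste(PAGE_SIZE + 1, PAGE_SIZE));
    }
}
```

With these inputs a 512-byte block is charged as 4096 bytes, and a 4097-byte file wastes 4095 bytes, which is the "worst case 4k per cached file" mentioned above.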

> dfsadmin -report doesn't output cache information properly
> ----------------------------------------------------------
>
>                 Key: HDFS-5659
>                 URL: https://issues.apache.org/jira/browse/HDFS-5659
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: caching
>    Affects Versions: 3.0.0
>            Reporter: Akira AJISAKA
>            Assignee: Andrew Wang
>         Attachments: hdfs-5659-1.patch
>
>
> I tried to cache a file by "hdfs cacheadmin -addDirective".
> I thought the file was cached because "CacheUsed" at jmx was more than 0.
> {code}
> {
>     "name" : "Hadoop:service=DataNode,name=FSDatasetState-DS-1043926324-172.28.0.102-50010-1385087929296",
>     "modelerType" : "org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl",
>     "Remaining" : 5604772597760,
>     "StorageInfo" : "FSDataset{dirpath='[/hadoop/data1/dfs/data/current, /hadoop/data2/dfs/data/current, /hadoop/data3/dfs/data/current]'}",
>     "Capacity" : 5905374474240,
>     "DfsUsed" : 11628544,
>     "CacheCapacity" : 1073741824,
>     "CacheUsed" : 360448,
>     "NumFailedVolumes" : 0,
>     "NumBlocksCached" : 1,
>     "NumBlocksFailedToCache" : 0,
>     "NumBlocksFailedToUncache" : 0
>   },
> {code}
> But "dfsadmin -report" didn't output the same value as jmx.
> {code}
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> {code}

