[jira] [Commented] (HDFS-9197) Snapshot FileDiff added to last snapshot when INodeFile accessTime field is updated
Colin Patrick McCabe commented on HDFS-9197:

Hi [~axenol], I definitely agree that we should be able to see these additional fields being
added via snapshotDiff.  If we can't, that's frustrating.

"Access time" is a very frustrating feature.  As you noticed, it essentially turns every read
operation into a write operation.  This has a lot of bad effects... on local filesystems,
it generates lots of writes to disk just from reading files.  It complicates the implementation
of distributed filesystems.  Clearly, it also increases snapshot sizes since writes need to
be recorded in snapshots.

These bad things are all fundamental to atime.  This is one reason why many sysadmins always
run with "noatime."  It's why Linux distributions ship with "relatime" (reduced granularity
atime) by default.  HDFS's {{dfs.namenode.accesstime.precision}} configuration key is similar.
We could consider changing {{dfs.namenode.accesstime.precision}} to 0 (which disables access
time updates) by default in Hadoop 3.0, a major release, since most would consider it an
incompatible change.
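
To make the effect of the precision setting concrete, here is a minimal sketch of how such a threshold gates access time updates.  It is a simplified model and not the NameNode code: {{FileInode}}, {{should_update_atime}} and {{read_file}} are hypothetical names, and the only facts assumed are that the precision is in milliseconds, defaults to one hour, and that 0 disables updates.

```python
from dataclasses import dataclass

HOUR_MS = 60 * 60 * 1000  # one hour in milliseconds, the default precision


@dataclass
class FileInode:
    """Toy stand-in for an inode; not the real INodeFile class."""
    access_time_ms: int


def should_update_atime(inode, now_ms, precision_ms):
    """Return True if a read at now_ms should persist a new access time."""
    if precision_ms <= 0:
        # A precision of 0 turns access time tracking off entirely.
        return False
    # Only update once the stored atime is older than the precision window.
    return now_ms > inode.access_time_ms + precision_ms


def read_file(inode, now_ms, precision_ms):
    """Simulate a read; return True if it turned into a metadata write."""
    if should_update_atime(inode, now_ms, precision_ms):
        inode.access_time_ms = now_ms
        return True
    return False
```

In this model a read only becomes a write (and therefore a candidate for a snapshot diff) when the stored access time is more than one precision interval old, and with the precision at 0 no read ever does.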

> Snapshot FileDiff added to last snapshot when INodeFile accessTime field is updated
> -----------------------------------------------------------------------------------
>                 Key: HDFS-9197
>                 URL: https://issues.apache.org/jira/browse/HDFS-9197
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: snapshots
>    Affects Versions: 2.3.0, 2.4.0
>            Reporter: Alex Ivanov
> Summary
> When a file in HDFS is read, its corresponding inode's accessTime field is updated. If
the file is present in the last snapshot, the accessTime change causes a FileDiff to be added
to the SnapshotDiff of the last snapshot.
> This behavior has the following problems:
> - Since FileDiffs reside in memory on the namenodes, snapshots become progressively
more memory-heavy with increasing volume of data in HDFS. On a system with frequent snapshots,
e.g. hourly, this becomes a big problem since with, say, 2000 snapshots, one can have 2000 FileDiffs
per file pointing to the same inode.
> - FSImage grows in size tremendously, and upload operation from standby to active namenode
takes much longer.
> - The generated FileDiff does not contain any useful information that I can see. Since
all FileDiffs for that file are pointing to the same inode, the accessTime they see is the
same.
> - I was wrong about the last point. Each FileDiff includes a SnapshotCopy attribute,
which contains the updated accessTime. This may be a feature, but I'd question the value of
having it enabled by default.
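
The growth described in the list above can be sketched with a small counting model.  This is an illustration under stated assumptions, not a measurement: {{count_file_diffs}} is a hypothetical helper, and it assumes that each accessTime update records at most one FileDiff for the file in the latest snapshot that exists at that moment.

```python
import bisect

HOUR_MS = 3_600_000  # one hour in milliseconds


def count_file_diffs(snapshot_times_ms, atime_update_times_ms):
    """Count the snapshots that end up holding a FileDiff for one file.

    Assumption: an accessTime update adds at most one FileDiff per file
    to the latest snapshot taken at or before the update.
    """
    snapshots = sorted(snapshot_times_ms)
    touched = set()
    for t in atime_update_times_ms:
        # Index of the latest snapshot taken at or before time t.
        idx = bisect.bisect_right(snapshots, t) - 1
        if idx >= 0:
            touched.add(idx)
    return len(touched)


# Hourly snapshots, with the file read once after each snapshot.
snapshots = [i * HOUR_MS for i in range(2000)]
reads = [s + 1 for s in snapshots]
diffs_per_file = count_file_diffs(snapshots, reads)
```

Under these assumptions a file that is read once per snapshot interval accumulates one FileDiff per snapshot, which matches the 2000-per-file figure mentioned above; a file that is never read accumulates none.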
> Configuration:
> CDH 5.0.5 (Hadoop 2.3 / 2.4)
> We are NOT overwriting the default parameter:
> Note that it determines the allowed frequency of accessTime field updates - every hour
by default.
> How to reproduce:
> {code}
> [root@node1076]# hdfs dfs -ls /data/tenants/testenv.testtenant/wddata
> Found 3 items
> drwxr-xr-x   - hdfs hadoop          0 2015-10-04 10:52 /data/tenants/testenv.testtenant/wddata/folder1
> -rw-r--r--   3 hdfs hadoop         38 2015-10-05 03:13 /data/tenants/testenv.testtenant/wddata/testfile1
> -rw-r--r--   3 hdfs hadoop         21 2015-10-04 10:45 /data/tenants/testenv.testtenant/wddata/testfile2
> [root@node1076]# hdfs dfs -ls /data/tenants/testenv.testtenant/wddata/.snapshot
> Found 8 items
> drwxr-xr-x   - hdfs hadoop          0 2015-10-04 10:47 /data/tenants/testenv.testtenant/wddata/.snapshot/sn1
> drwxr-xr-x   - hdfs hadoop          0 2015-10-04 10:47 /data/tenants/testenv.testtenant/wddata/.snapshot/sn2
> drwxr-xr-x   - hdfs hadoop          0 2015-10-04 10:52 /data/tenants/testenv.testtenant/wddata/.snapshot/sn3
> drwxr-xr-x   - hdfs hadoop          0 2015-10-04 10:53 /data/tenants/testenv.testtenant/wddata/.snapshot/sn4
> drwxr-xr-x   - hdfs hadoop          0 2015-10-04 10:57 /data/tenants/testenv.testtenant/wddata/.snapshot/sn5
> drwxr-xr-x   - hdfs hadoop          0 2015-10-04 10:58 /data/tenants/testenv.testtenant/wddata/.snapshot/sn6
> drwxr-xr-x   - hdfs hadoop          0 2015-10-05 03:13 /data/tenants/testenv.testtenant/wddata/.snapshot/sn7
> drwxr-xr-x   - hdfs hadoop          0 2015-10-05 04:20 /data/tenants/testenv.testtenant/wddata/.snapshot/sn8
> [root@node1076]# hdfs dfs -createSnapshot /data/tenants/testenv.testtenant/wddata sn9
> Created snapshot /data/tenants/testenv.testtenant/wddata/.snapshot/sn9
> [root@node1076]# hdfs snapshotDiff /data/tenants/testenv.testtenant/wddata sn8 sn9
> Difference between snapshot sn8 and snapshot sn9 under directory /data/tenants/testenv.testtenant/wddata:
> ################
> ## IMPORTANT: testfile1 was put into HDFS more than 1 hour ago, which triggers the accessTime
> ################
> [root@node1076]# hdfs dfs -cat /data/tenants/testenv.testtenant/wddata/testfile1
> This is test file 1, but now it's 11.
> [root@node1076]# hdfs dfs -createSnapshot /data/tenants/testenv.testtenant/wddata sn10
> Created snapshot /data/tenants/testenv.testtenant/wddata/.snapshot/sn10
> [root@node1076]# hdfs snapshotDiff /data/tenants/testenv.testtenant/wddata sn9 sn10
> Difference between snapshot sn9 and snapshot sn10 under directory /data/tenants/testenv.testtenant/wddata:
> M	./testfile1
> {code}

