Thanks for the reply.
I used the cat command this time, but the result is not great.
In my test, file hadoop003.log is cached while hadoop010.log is not cached.
-bash-4.1$  /hadoop/hadoop-2.3.0/bin/hadoop fs -ls
-rw-r--r--   3 hdfs hadoop  209715206 2014-03-06 18:14 hadoop003.log
-rw-r--r--   3 hdfs hadoop  209715272 2014-03-07 14:37 hadoop010.log
-bash-4.1$ hdfs cacheadmin -listDirectives -stats -path hadoop003.log
Found 1 entry
  5 wptest1      3 never   /user/hdfs/hadoop003.log      629145618     629145618             1             1
First run:
-bash-4.1$  time /hadoop/hadoop-2.3.0/bin/hadoop fs -cat hadoop003.log> /tmp/aa
real    0m4.881s
user    0m4.805s
sys     0m1.468s

-bash-4.1$  time /hadoop/hadoop-2.3.0/bin/hadoop fs -cat hadoop010.log> /tmp/aa
real    0m6.479s
user    0m4.777s
sys     0m1.312s

Second run:
-bash-4.1$ time /hadoop/hadoop-2.3.0/bin/hadoop fs -cat hadoop003.log> /tmp/aa
real    0m4.751s
user    0m4.685s
sys     0m1.313s
-bash-4.1$ time /hadoop/hadoop-2.3.0/bin/hadoop fs -cat hadoop010.log> /tmp/aa
real    0m4.916s
user    0m4.779s
sys     0m1.378s
I did not see much improvement from caching.
Please advise.

If you take a look at this output, you can see that nothing is actually cached.

One way to figure out why is to look at the logs of the
NameNode and DataNode.  Some of the relevant messages are logged at DEBUG or
TRACE level, so you may need to raise the log level.  The
CacheReplicationMonitor and FsDatasetCache classes are good places to
start.
Also be sure to check that you have set dfs.datanode.max.locked.memory.
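
As a rough sketch of that check: dfs.datanode.max.locked.memory is given in bytes and defaults to 0, which disables caching, while "ulimit -l" reports the memlock limit in kilobytes (or "unlimited"). The helper name check_locked_memory below is made up for illustration, and it assumes the configured value is a plain byte count.

```shell
#!/usr/bin/env bash
# check_locked_memory is a hypothetical helper, not a Hadoop tool.
# $1: configured dfs.datanode.max.locked.memory, in bytes (assumed numeric)
# $2: memlock limit as printed by `ulimit -l` (kilobytes, or "unlimited")
check_locked_memory() {
  local conf_bytes=$1 limit_kb=$2
  if [ "$conf_bytes" -eq 0 ]; then
    echo "caching disabled: dfs.datanode.max.locked.memory is 0"
    return 1
  fi
  if [ "$limit_kb" = "unlimited" ]; then
    echo "ok: memlock ulimit is unlimited"
    return 0
  fi
  # Convert the ulimit from kilobytes to bytes before comparing.
  if [ $((limit_kb * 1024)) -lt "$conf_bytes" ]; then
    echo "too low: ulimit allows $((limit_kb * 1024)) bytes, config asks for $conf_bytes"
    return 1
  fi
  echo "ok: ulimit allows $((limit_kb * 1024)) bytes, config asks for $conf_bytes"
}

# On a DataNode host, as the user that runs the DataNode, one might call:
#   check_locked_memory "$(hdfs getconf -confKey dfs.datanode.max.locked.memory)" "$(ulimit -l)"
```

Note that "hdfs getconf" reads the configuration visible to the shell it runs in, which may not be the configuration the running DataNode was started with.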

As Andrew commented, "hadoop tail" is not a good command to use for
measuring performance, since you have a few seconds of Java startup
time, followed by any HDFS setup time, followed by reading a single
kilobyte of data.  If you want to use the shell, the simplest thing to
do is to use cat and read a large file, so that those startup costs
don't dominate the measurement.
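
A minimal sketch of such a measurement, assuming bash: run the same command several times and print only the wall-clock seconds for each run, so a one-off slow read is easy to tell apart from the steady state. The helper name time_runs is made up for illustration.

```shell
#!/usr/bin/env bash
# time_runs is a hypothetical helper: run a command N times and print the
# elapsed wall-clock seconds of each run, one per line.
time_runs() {
  local n=$1; shift
  local TIMEFORMAT=%R   # make bash's `time` keyword print only elapsed seconds
  local i
  for ((i = 1; i <= n; i++)); do
    # Discard the command's own output; keep only what `time` writes to stderr.
    { time "$@" >/dev/null 2>&1; } 2>&1
  done
}

# Example comparison of a cached and an uncached file:
#   time_runs 3 hadoop fs -cat hadoop003.log
#   time_runs 3 hadoop fs -cat hadoop010.log
```

Each run still includes the JVM startup time, so the file needs to be large enough that reading it accounts for most of the elapsed time.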


> -bash-4.1$ file /hadoop/hadoop-2.3.0/lib/native/
> /hadoop/hadoop-2.3.0/lib/native/ ELF 64-bit LSB shared
> object, x86-64, version 1 (SYSV), dynamically linked, not stripped
> I also tried the word count example with the same file. The execution time
> is always 40 seconds. (The map/reduce job without cache is 42 seconds)
> Is there anything wrong?
> Thanks a lot