hadoop-mapreduce-user mailing list archives

From Ravi Prakash <ravihad...@gmail.com>
Subject Re: Why is the size of a HDFS file changed?
Date Fri, 06 Jan 2017 15:17:08 GMT
Is there a carriage return / new line / some other whitespace which `cat`
may be appending?
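
A rough way to check this on the local copies, as a sketch only: the helper
names `byte_count`, `compare_copies`, and `show_tail` below are made up for
illustration, and they rely only on the standard `wc`, `cmp`, `tail`, and `od`
tools. They work on local files, not on HDFS paths.

```shell
#!/bin/sh
# Hypothetical helpers for comparing two local copies of a file.

# Print the exact size of a file in bytes.
byte_count() {
    wc -c < "$1" | tr -d ' '
}

# Print both sizes, then report whether the contents match byte for byte.
compare_copies() {
    echo "$1: $(byte_count "$1") bytes"
    echo "$2: $(byte_count "$2") bytes"
    if cmp -s "$1" "$2"; then
        echo "contents identical"
    else
        echo "contents differ"
    fi
}

# Dump the last 16 bytes so trailing \r, \n, or spaces become visible.
show_tail() {
    tail -c 16 "$1" | od -c
}

# Example (file names taken from the original message):
#   compare_copies AFromGet AFromCat
#   show_tail AFromGet
```

If `show_tail` shows the same ending for both copies, trailing whitespace is
probably not the explanation.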

On Thu, Jan 5, 2017 at 6:09 PM, Mungeol Heo <mungeol.heo@gmail.com> wrote:

> Hello,
>
> Suppose I call the HDFS file that causes the problem A.
>
> hdfs dfs -ls A
> -rw-r--r--   3 web_admin hdfs  868003931 2017-01-04 09:05 A
>
> hdfs dfs -get A AFromGet
> hdfs dfs -cat A > AFromCat
>
> ls -l
> -rw-r--r-- 1 hdfs hadoop 883715443 Jan  5 18:32 AFromGet
> -rw-r--r-- 1 hdfs hadoop 883715443 Jan  5 18:32 AFromCat
>
> hdfs dfs -put AFromGet
>
> diff <(hdfs dfs -cat  A) <(hdfs dfs -cat AFromGet)
> (No output, which means the contents of the two files are the same, at
> least after "cat".)
>
> hdfs dfs -checksum A
> A   MD5-of-262144MD5-of-512CRC32C
> 000002000000000000040000e667fb4f0dda78101feb2b689af8260b
>
> hdfs dfs -checksum AFromGet
> AFromGet   MD5-of-262144MD5-of-512CRC32C
> 0000020000000000000400007284759249ff98c7395e6a4bb59343dc
>
> Given the results listed above, I wonder why the size of the file
> changed.
> Any help will be GREAT!
>
> Thank you.
