hadoop-common-user mailing list archives

From Mungeol Heo <mungeol....@gmail.com>
Subject Re: Why is the size of a HDFS file changed?
Date Mon, 09 Jan 2017 02:52:18 GMT
"^A" is used as the delimiter in the file.
However, I don't think this is what causes the problem, because
other files also use "^A" as the delimiter and have no problem.
BTW, "^A" is the delimiter because these files are Hive data.
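
To rule out trailing whitespace or a delimiter issue on the local side, the two local copies can be compared byte by byte. The following is only a minimal sketch: the helper names `compare_copies` and `show_tail` are hypothetical and not part of Hadoop, and it relies only on the standard `wc`, `cmp`, `tail` and `od` utilities.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the byte size of two
# local files and report whether they are byte-identical.
compare_copies() {
    a=$1
    b=$2
    # wc -c counts bytes; reading from stdin avoids printing the file name.
    size_a=$(wc -c < "$a" | tr -d ' ')
    size_b=$(wc -c < "$b" | tr -d ' ')
    echo "size $a: $size_a"
    echo "size $b: $size_b"
    # cmp -s is silent and exits 0 only when the files are byte-identical.
    if cmp -s "$a" "$b"; then
        echo "identical"
    else
        echo "different"
        # Without -s, cmp reports the first differing byte and line.
        cmp "$a" "$b" || true
    fi
}

# Hypothetical helper: dump the last 16 bytes of a file so that a trailing
# newline (\n) or carriage return (\r) becomes visible.
show_tail() {
    tail -c 16 "$1" | od -c
}
```

For example, `compare_copies AFromGet AFromCat` followed by `show_tail AFromGet` would show whether the two downloads differ and what their final bytes are. Piping the HDFS stream through `wc -c` (`hdfs dfs -cat A | wc -c`) counts the bytes actually read, which can then be checked against the size reported by `hdfs dfs -ls A`.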

On Sat, Jan 7, 2017 at 12:17 AM, Ravi Prakash <ravihadoop@gmail.com> wrote:
> Is there a carriage return / new line / some other whitespace which `cat`
> may be appending?
>
> On Thu, Jan 5, 2017 at 6:09 PM, Mungeol Heo <mungeol.heo@gmail.com> wrote:
>>
>> Hello,
>>
>> Suppose I call the HDFS file that causes the problem A.
>>
>> hdfs dfs -ls A
>> -rw-r--r--   3 web_admin hdfs  868003931 2017-01-04 09:05 A
>>
>> hdfs dfs -get A AFromGet
>> hdfs dfs -cat A > AFromCat
>>
>> ls -l
>> -rw-r--r-- 1 hdfs hadoop 883715443 Jan  5 18:32 AFromGet
>> -rw-r--r-- 1 hdfs hadoop 883715443 Jan  5 18:32 AFromCat
>>
>> hdfs dfs -put AFromGet
>>
>> diff <(hdfs dfs -cat  A) <(hdfs dfs -cat AFromGet)
>> (no output, which means the contents of the two files are the same,
>> at least after "cat")
>>
>> hdfs dfs -checksum A
>> A   MD5-of-262144MD5-of-512CRC32C
>> 000002000000000000040000e667fb4f0dda78101feb2b689af8260b
>>
>> hdfs dfs -checksum AFromGet
>> AFromGet   MD5-of-262144MD5-of-512CRC32C
>> 0000020000000000000400007284759249ff98c7395e6a4bb59343dc
>>
>> I have listed some results above. I wonder why the size of the file
>> changed.
>> Any help would be great!
>>
>> Thank you.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@hadoop.apache.org
>> For additional commands, e-mail: user-help@hadoop.apache.org
>>
>
