hadoop-common-user mailing list archives

From Tao Xiao <xiaotao.cs....@gmail.com>
Subject Re: A non-empty file's size is reported as 0
Date Tue, 08 Apr 2014 13:43:53 GMT
My mapper code is as follows, and I can't tell whether any file is left
unclosed.

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public class TheMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
    private MultipleOutputs<Text, NullWritable> outputs;

    @Override
    protected void setup(Context ctx) {
        outputs = new MultipleOutputs<Text, NullWritable>(ctx);
    }

    @Override
    protected void map(LongWritable o, Text t, Context ctx)
            throws IOException, InterruptedException {
        // Write every input line under the base output path "2014-01-20/".
        outputs.write(t, NullWritable.get(), "2014-01-20/");
    }

    @Override
    protected void cleanup(Context ctx)
            throws IOException, InterruptedException {
        // Close MultipleOutputs so the underlying record writers flush and
        // close their output files.
        outputs.close();
    }
}
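
In case the job setup matters, this is roughly how the job is submitted. This
is only a minimal driver sketch: the class name TheDriver and the use of
TextOutputFormat/LazyOutputFormat are illustrative, not necessarily the exact
driver I run.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class TheDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "multiple-outputs-test");
        job.setJarByClass(TheDriver.class);
        job.setMapperClass(TheMapper.class);
        job.setNumReduceTasks(0);                 // map-only job
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        // Only create part files that are actually written to.
        LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}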




2014-04-08 21:17 GMT+08:00 Peyman Mohajerian <mohajeri@gmail.com>:

> If you didn't close the file correctly, the NameNode wouldn't be notified
> of the final size of the file. The file size is metadata coming from the
> NameNode.
>
>
> On Tue, Apr 8, 2014 at 4:35 AM, Tao Xiao <xiaotao.cs.nju@gmail.com> wrote:
>
>> I wrote some data into a file using MultipleOutputs in my mappers. I can
>> see the file's contents with "hadoop fs -cat <file>", but its size is
>> reported as zero by "hadoop fs -du <file>" and "hadoop fs -ls <file>",
>> as follows:
>>
>> -rw-r--r--   3 hadoop hadoop         0 2014-04-07 22:06
>> /test/xt/out/2014-01-20/-m-00000
>>
>> BTW, when I download this file from HDFS to the local file system, it has
>> the correct size. Why is its size reported as zero by the Hadoop CLI?
>>
>>
>>
>
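
To follow up on the point above that the size is NameNode metadata, here is a
small sketch that prints the length the NameNode currently reports for the
output file. The class name CheckLen is only illustrative; the path is the one
from the listing above. Running "hdfs fsck /test/xt/out/2014-01-20 -files
-blocks -openforwrite" should also show whether the file is still open for
write.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckLen {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path p = new Path("/test/xt/out/2014-01-20/-m-00000");
        FileStatus st = fs.getFileStatus(p);
        // Length as recorded in NameNode metadata -- this is what
        // "hadoop fs -ls" and "hadoop fs -du" print.
        System.out.println(p + " length = " + st.getLen());
    }
}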
