hbase-user mailing list archives

From Bijieshan <bijies...@huawei.com>
Subject RE: Why so many unexpected files like partitions_xxxx are created?
Date Mon, 16 Dec 2013 12:12:43 GMT
The reduce partition information is stored in this partitions_XXXX file. See the code below:

    Path partitionsPath = new Path(job.getWorkingDirectory(),
                                   "partitions_" + UUID.randomUUID());
    LOG.info("Writing partition information to " + partitionsPath);

    FileSystem fs = partitionsPath.getFileSystem(conf);
    writePartitions(conf, partitionsPath, startKeys);
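
The snippet above picks a fresh random name under the job's working directory on every run, and nothing in the lines shown removes the file afterwards. The working directory usually defaults to the submitting user's HDFS home directory, which would explain why the files pile up in /user/root/. Below is a minimal sketch of that effect, using the local filesystem (java.nio) as a stand-in for HDFS. The class and method names are hypothetical and not part of HBase or Hadoop.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Local-filesystem stand-in for the HDFS behaviour; every name here is hypothetical.
final class PartitionFileDemo {

    // Mimics the naming scheme above: a new random suffix on every call,
    // so an earlier file is never overwritten.
    static Path writePartitionFile(Path workingDir) throws IOException {
        Path p = workingDir.resolve("partitions_" + UUID.randomUUID());
        return Files.createFile(p);
    }

    // Lists every file in the directory whose name matches partitions_*.
    static List<Path> listPartitionFiles(Path workingDir) throws IOException {
        List<Path> found = new ArrayList<>();
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(workingDir, "partitions_*")) {
            for (Path p : stream) {
                found.add(p);
            }
        }
        return found;
    }

    // Deletes the matching files and returns how many were removed.
    static int deletePartitionFiles(Path workingDir) throws IOException {
        int deleted = 0;
        for (Path p : listPartitionFiles(workingDir)) {
            Files.delete(p);
            deleted++;
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        Path workingDir = Files.createTempDirectory("job-working-dir");
        // Three simulated job runs leave three separate files behind.
        for (int i = 0; i < 3; i++) {
            writePartitionFile(workingDir);
        }
        System.out.println("files left: " + listPartitionFiles(workingDir).size());
        System.out.println("deleted: " + deletePartitionFiles(workingDir));
        Files.delete(workingDir);
    }
}
```

On a real cluster the analogous cleanup would go through the HDFS shell or the FileSystem API instead of java.nio, and only once the job that wrote the file has finished.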

Hope it helps.

-----Original Message-----
From: Tao Xiao [mailto:xiaotao.cs.nju@gmail.com] 
Sent: Monday, December 16, 2013 6:48 PM
To: user@hbase.apache.org
Subject: Why so many unexpected files like partitions_xxxx are created?

I imported data into HBase via bulk load, but afterwards I found that many unexpected
files had been created in the HDFS directory /user/root/. They look like this:

... ...
... ...

It seems that they are HFiles, but I don't know why they were created here.

I bulk load data into HBase in the following way:

First, I wrote a MapReduce program that has only map tasks. The map
tasks read some text data and emit it in the form of a row key and a KeyValue. The following
is my program:

        protected void map(NullWritable NULL, GtpcV1SignalWritable signal, Context ctx)
                throws InterruptedException, IOException {
            String strRowkey = xxx;
            byte[] rowkeyBytes = Bytes.toBytes(strRowkey);
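            // Assumed: rowkey wraps the row key bytes as the map output key.
            ImmutableBytesWritable rowkey = new ImmutableBytesWritable(rowkeyBytes);
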
            KeyValue kv = new KeyValue(rowkeyBytes, Family_A, Qualifier_Q, part1.serialize());
            ctx.write(rowkey, kv);

            kv = new KeyValue(rowkeyBytes, Family_B, Qualifier_Q, part2.serialize());
            ctx.write(rowkey, kv);

After the MR program finished, several HFiles had been generated in the output directory
I specified.

Then I began to load these HFiles into HBase using the following command:
       hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles HFiles-Dir MyTable

Finally, I could see that the data had indeed been loaded into the table in HBase.

But I could also see that many unexpected files had been generated in the HDFS directory
/user/root/, just as I mentioned at the beginning of this mail, and I did not specify
any files to be produced in this directory.

What happened? Can anyone tell me what these files are and what produced them?
