hbase-user mailing list archives

From Tao Xiao <xiaotao.cs....@gmail.com>
Subject Re: Why so many unexpected files like partitions_xxxx are created?
Date Wed, 18 Dec 2013 02:31:33 GMT
BTW, I noticed another problem. I bulk load data into HBase every five
minutes, and I found that each time the following command is executed

    hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles HFiles-Dir MyTable

a new process called "LoadIncrementalHFiles" appears.

Using the "jps" command in the terminal, I can see many processes called
"LoadIncrementalHFiles". Why are these processes still there even after the
command that bulk loads the HFiles into HBase has finished executing? I have
to kill them myself.


2013/12/17 Bijieshan <bijieshan@huawei.com>

> Yes, it should be cleaned up. But as I understand it, the current code
> does not do that.
>
> Jieshan.
> -----Original Message-----
> From: Ted Yu [mailto:yuzhihong@gmail.com]
> Sent: Tuesday, December 17, 2013 10:55 AM
> To: user@hbase.apache.org
> Subject: Re: Why so many unexpected files like partitions_xxxx are created?
>
> Should the bulk load task clean up partitions_xxxx upon completion?
>
> Cheers
>
>
> On Mon, Dec 16, 2013 at 6:53 PM, Bijieshan <bijieshan@huawei.com> wrote:
>
> > > I think I should delete these files immediately after I have finished
> > > bulk loading data into HBase since they are useless at that time, right?
> >
> > Yes, I think so. They are useless once the bulk load task has finished.
> >
> > Jieshan.
> > -----Original Message-----
> > From: Tao Xiao [mailto:xiaotao.cs.nju@gmail.com]
> > Sent: Tuesday, December 17, 2013 9:34 AM
> > To: user@hbase.apache.org
> > Subject: Re: Why so many unexpected files like partitions_xxxx are created?
> >
> > Indeed these files are produced by
> > org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles in the directory
> > returned by job.getWorkingDirectory(), and I think I should delete these
> > files immediately after I have finished bulk loading data into HBase,
> > since they are useless at that time, right?
> >
> >
> >
> >
> > 2013/12/16 Bijieshan <bijieshan@huawei.com>
> >
> > > The reduce partition information is stored in these partitions_XXXX
> > > files. See the code below:
> > >
> > > HFileOutputFormat#configureIncrementalLoad:
> > >         .....................
> > >     Path partitionsPath = new Path(job.getWorkingDirectory(),
> > >                                    "partitions_" + UUID.randomUUID());
> > >     LOG.info("Writing partition information to " + partitionsPath);
> > >
> > >     FileSystem fs = partitionsPath.getFileSystem(conf);
> > >     writePartitions(conf, partitionsPath, startKeys);
> > >         .....................
> > >
> > > Hope it helps.
> > >
> > > Jieshan
> > > -----Original Message-----
> > > From: Tao Xiao [mailto:xiaotao.cs.nju@gmail.com]
> > > Sent: Monday, December 16, 2013 6:48 PM
> > > To: user@hbase.apache.org
> > > Subject: Why so many unexpected files like partitions_xxxx are created?
> > >
> > > I imported data into HBase by bulk load, but afterwards I found that
> > > many unexpected files had been created in the HDFS directory
> > > /user/root/. They look like this:
> > >
> > > /user/root/partitions_fd74866b-6588-468d-8463-474e202db070
> > > /user/root/partitions_fd867cd2-d9c9-48f5-9eec-185b2e57788d
> > > /user/root/partitions_fda37b8a-a882-4787-babc-8310a969f85c
> > > /user/root/partitions_fdaca2f4-2792-41f6-b7e8-61a8a5677dea
> > > /user/root/partitions_fdd55baa-3a12-493e-8844-a23ae83209c5
> > > /user/root/partitions_fdd85a3c-9abe-45d4-a0c6-76d2bed88ea5
> > > /user/root/partitions_fe133460-5f3f-4c6a-9fff-ff6c62410cc1
> > > /user/root/partitions_fe29a2b0-b281-465f-8d4a-6044822d960a
> > > /user/root/partitions_fe2fa6fa-9066-484c-bc91-ec412e48d008
> > > /user/root/partitions_fe31667b-2d5a-452e-baf7-a81982fe954a
> > > /user/root/partitions_fe3a5542-bc4d-4137-9d5e-1a0c59f72ac3
> > > /user/root/partitions_fe6a9407-c27b-4a67-bb50-e6b9fd172bc9
> > > /user/root/partitions_fe6f9294-f970-473c-8659-c08292c27ddd
> > > ... ...
> > > ... ...
> > >
> > >
> > > It seems that they are HFiles, but I don't know why they were created
> > > here.
> > >
> > > I bulk load data into HBase in the following way:
> > >
> > > First, I wrote a MapReduce program which has only map tasks. The map
> > > tasks read some text data and emit it in the form of RowKey and
> > > KeyValue. The following is my program:
> > >
> > >         @Override
> > >         protected void map(NullWritable NULL, GtpcV1SignalWritable signal,
> > >                 Context ctx) throws InterruptedException, IOException {
> > >             String strRowkey = xxx;
> > >             byte[] rowkeyBytes = Bytes.toBytes(strRowkey);
> > >
> > >             rowkey.set(rowkeyBytes);
> > >
> > >             part1.init(signal);
> > >             part2.init(signal);
> > >
> > >             KeyValue kv = new KeyValue(rowkeyBytes, Family_A,
> > >                     Qualifier_Q, part1.serialize());
> > >             ctx.write(rowkey, kv);
> > >
> > >             kv = new KeyValue(rowkeyBytes, Family_B, Qualifier_Q,
> > >                     part2.serialize());
> > >             ctx.write(rowkey, kv);
> > >         }
> > >
> > >
> > > After the MR program finished, several HFiles had been generated in
> > > the output directory I specified.
> > >
> > > Then I began to load these HFiles into HBase using the following
> > > command:
> > >        hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles HFiles-Dir MyTable
> > >
> > > Finally, I could see that the data was indeed loaded into the table
> > > in HBase.
> > >
> > >
> > > But I could also see that many unexpected files had been generated
> > > in the HDFS directory /user/root/, as I mentioned at the beginning
> > > of this mail, and I did not specify any files to be produced in this
> > > directory.
> > >
> > > What happened? Who can tell me what these files are and who produced
> > > them?
> > >
> > > Thanks
> > >
> >
>
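
The cleanup discussed in the thread can be sketched roughly as follows. This
is only an illustration, not code from HBase: the class name
PartitionFileFilter and the method selectPartitionFiles are hypothetical, and
the sketch only selects candidate paths from a directory listing by matching
the "partitions_" + UUID naming shown in the quoted configureIncrementalLoad
snippet. Actually removing the selected paths (for example with
"hdfs dfs -rm", or with FileSystem.delete in Java) is left out, and should
only happen once the job that uses the files has finished.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HBase: picks out partitions_<UUID> files
// from a list of path strings so they can be deleted afterwards.
public class PartitionFileFilter {

    // "partitions_" followed by a UUID in its canonical lowercase string form,
    // as produced by "partitions_" + UUID.randomUUID().
    private static final Pattern PARTITIONS_NAME = Pattern.compile(
        "partitions_[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}");

    // Returns the paths whose last component is a partitions_<UUID> name.
    public static List<String> selectPartitionFiles(List<String> paths) {
        List<String> selected = new ArrayList<>();
        for (String path : paths) {
            String name = path.substring(path.lastIndexOf('/') + 1);
            if (PARTITIONS_NAME.matcher(name).matches()) {
                selected.add(path);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Example listing; the second and third entries are made up to show
        // that unrelated names are left alone.
        List<String> listing = Arrays.asList(
            "/user/root/partitions_fd74866b-6588-468d-8463-474e202db070",
            "/user/root/HFiles-Dir",
            "/user/root/partitions_backup");
        for (String path : selectPartitionFiles(listing)) {
            System.out.println(path);
        }
    }
}
```

Matching the full UUID pattern rather than just the "partitions_" prefix is a
deliberate choice here, so that an unrelated file that merely starts with
"partitions_" is not selected for deletion.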
