hbase-user mailing list archives

From Tao Xiao <xiaotao.cs....@gmail.com>
Subject Re: Why so many unexpected files like partitions_xxxx are created?
Date Fri, 20 Dec 2013 06:18:17 GMT
Hi Ted,
     You asked me to check the log of LoadIncrementalHFiles to see what the
error from the region server was, but where is the log of LoadIncrementalHFiles?
Is it written into the region server's log? The region server itself seems to be
working fine.




2013/12/19 Ted Yu <yuzhihong@gmail.com>

> From the stack trace posted I saw:
>
> org.apache.commons.logging.impl.Log4JLogger.error(Log4JLogger.java:257)
>     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.tryAtomicRegionLoad(LoadIncrementalHFiles.java:577)
>
> Assuming 0.94 is used, line 577 at the tip of 0.94 is:
>         LOG.warn("Attempt to bulk load region containing "
>             + Bytes.toStringBinary(first) + " into table "
>
> But the following should be the corresponding line w.r.t. the stack trace:
>     } catch (IOException e) {
>       LOG.error("Encountered unrecoverable error from region server", e);
>
> Tao:
> Can you check the log of LoadIncrementalHFiles to see what the error from
> the region server was?
>
> As Jieshan said, checking the region server log would reveal something.
>
> Cheers
>
>
> On Tue, Dec 17, 2013 at 10:40 PM, Bijieshan <bijieshan@huawei.com> wrote:
>
> > It seems LoadIncrementalHFiles is still running. Can you run "jstack" on
> > one RegionServer process also?
> >
> > Which version are you using?
> >
> > Jieshan.
> > -----Original Message-----
> > From: Tao Xiao [mailto:xiaotao.cs.nju@gmail.com]
> > Sent: Wednesday, December 18, 2013 1:49 PM
> > To: user@hbase.apache.org
> > Subject: Re: Why so many unexpected files like partitions_xxxx are
> > created?
> >
> > I ran jstack on one such process and saw the following output in the
> > terminal. I guess this tells us that the processes started by the
> > "LoadIncrementalHFiles" command never exit. Why don't they exit after they
> > have finished running?
> >
> > ... ...
> > ... ...
> >
> > "LoadIncrementalHFiles-0.LruBlockCache.EvictionThread" daemon prio=10
> > tid=0x000000004129c000 nid=0x2186 in Object.wait() [0x00007f53f3665000]
> >    java.lang.Thread.State: WAITING (on object monitor)
> >     at java.lang.Object.wait(Native Method)
> >     - waiting on <0x000000075fcf3370> (a
> > org.apache.hadoop.hbase.io.hfile.LruBlockCache$EvictionThread)
> >     at java.lang.Object.wait(Object.java:485)
> >     at org.apache.hadoop.hbase.io.hfile.LruBlockCache$EvictionThread.run(LruBlockCache.java:631)
> >     - locked <0x000000075fcf3370> (a
> > org.apache.hadoop.hbase.io.hfile.LruBlockCache$EvictionThread)
> >     at java.lang.Thread.run(Thread.java:662)
> >
> >    Locked ownable synchronizers:
> >     - None
> >
> > "LoadIncrementalHFiles-3" prio=10 tid=0x00007f540ca55800 nid=0x2185
> > runnable [0x00007f53f3765000]
> >    java.lang.Thread.State: RUNNABLE
> >     at java.io.FileOutputStream.writeBytes(Native Method)
> >     at java.io.FileOutputStream.write(FileOutputStream.java:282)
> >     at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
> >     - locked <0x0000000763e5af70> (a java.io.BufferedOutputStream)
> >     at java.io.PrintStream.write(PrintStream.java:430)
> >     - locked <0x0000000763d5b670> (a java.io.PrintStream)
> >     at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:202)
> >     at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:263)
> >     at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:106)
> >     - locked <0x0000000763d6c6d0> (a java.io.OutputStreamWriter)
> >     at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:116)
> >     at java.io.OutputStreamWriter.write(OutputStreamWriter.java:203)
> >     at java.io.Writer.write(Writer.java:140)
> >     at org.apache.log4j.helpers.QuietWriter.write(QuietWriter.java:48)
> >     at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:317)
> >     at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
> >     at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
> >     - locked <0x0000000763d5fb90> (a org.apache.log4j.ConsoleAppender)
> >     at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
> >     at org.apache.log4j.Category.callAppenders(Category.java:206)
> >     - locked <0x0000000763d65fe8> (a org.apache.log4j.spi.RootLogger)
> >     at org.apache.log4j.Category.forcedLog(Category.java:391)
> >     at org.apache.log4j.Category.log(Category.java:856)
> >     at org.apache.commons.logging.impl.Log4JLogger.error(Log4JLogger.java:257)
> >     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.tryAtomicRegionLoad(LoadIncrementalHFiles.java:577)
> >     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$1.call(LoadIncrementalHFiles.java:316)
> >     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$1.call(LoadIncrementalHFiles.java:314)
> >     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> >     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> >     at java.lang.Thread.run(Thread.java:662)
> >
> >    Locked ownable synchronizers:
> >     - <0x000000075fe494c0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
> >
> > ... ...
> > ... ...
> >
> > "Reference Handler" daemon prio=10 tid=0x00007f540c138800 nid=0x2172 in
> > Object.wait() [0x00007f5401355000]
> >    java.lang.Thread.State: WAITING (on object monitor)
> >     at java.lang.Object.wait(Native Method)
> >     - waiting on <0x0000000763d51078> (a java.lang.ref.Reference$Lock)
> >     at java.lang.Object.wait(Object.java:485)
> >     at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
> >     - locked <0x0000000763d51078> (a java.lang.ref.Reference$Lock)
> >
> >    Locked ownable synchronizers:
> >     - None
> >
> > "main" prio=10 tid=0x00007f540c00e000 nid=0x216a waiting on condition
> > [0x00007f54114ac000]
> >    java.lang.Thread.State: WAITING (parking)
> >     at sun.misc.Unsafe.park(Native Method)
> >     - parking to wait for  <0x000000075ea67310> (a
> > java.util.concurrent.FutureTask$Sync)
> >     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
> >     at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
> >     at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969)
> >     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281)
> >     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
> >     at java.util.concurrent.FutureTask.get(FutureTask.java:83)
> >     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.bulkLoadPhase(LoadIncrementalHFiles.java:326)
> >     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:261)
> >     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.run(LoadIncrementalHFiles.java:780)
> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.main(LoadIncrementalHFiles.java:785)
> >
> >    Locked ownable synchronizers:
> >     - None
> >
> > "VM Thread" prio=10 tid=0x00007f540c132000 nid=0x2170 runnable
> >
> > "Gang worker#0 (Parallel GC Threads)" prio=10 tid=0x00007f540c01c800
> > nid=0x216b runnable
> >
> > "Gang worker#1 (Parallel GC Threads)" prio=10 tid=0x00007f540c01e800
> > nid=0x216c runnable
> >
> > "Gang worker#2 (Parallel GC Threads)" prio=10 tid=0x00007f540c020000
> > nid=0x216d runnable
> >
> > "Gang worker#3 (Parallel GC Threads)" prio=10 tid=0x00007f540c022000
> > nid=0x216e runnable
> >
> > "Concurrent Mark-Sweep GC Thread" prio=10 tid=0x00007f540c0b1000
> > nid=0x216f runnable "VM Periodic Task Thread" prio=10
> > tid=0x00007f540c16b000 nid=0x217a waiting on condition
> >
> > JNI global references: 1118
> >
> >
> > 2013/12/18 Ted Yu <yuzhihong@gmail.com>
> >
> > > Tao:
> > > Can you jstack one such process next time you see them hanging ?
> > >
> > > Thanks
> > >
> > >
> > > On Tue, Dec 17, 2013 at 6:31 PM, Tao Xiao <xiaotao.cs.nju@gmail.com>
> > > wrote:
> > >
> > > > BTW, I noticed another problem. I bulk load data into HBase every
> > > > five minutes, but I found that whenever the following command was executed
> > > >     hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles HFiles-Dir MyTable
> > > >
> > > > a new process called "LoadIncrementalHFiles" was started.
> > > >
> > > > I can see many processes called "LoadIncrementalHFiles" using the
> > > > "jps" command in the terminal. Why are these processes still there
> > > > even after the command that bulk loads HFiles into HBase has finished
> > > > executing? I have to kill them myself.
> > > >
> > > >
> > > > 2013/12/17 Bijieshan <bijieshan@huawei.com>
> > > >
> > > > > Yes, it should be cleaned up, but to my understanding that is not done
> > > > > in the current code.
> > > > >
> > > > > Jieshan.
> > > > > -----Original Message-----
> > > > > From: Ted Yu [mailto:yuzhihong@gmail.com]
> > > > > Sent: Tuesday, December 17, 2013 10:55 AM
> > > > > To: user@hbase.apache.org
> > > > > Subject: Re: Why so many unexpected files like partitions_xxxx are
> > > > > created?
> > > > >
> > > > > Should the bulk load task clean up partitions_xxxx upon completion?
> > > > >
> > > > > Cheers
> > > > >
> > > > >
> > > > > On Mon, Dec 16, 2013 at 6:53 PM, Bijieshan <bijieshan@huawei.com>
> > > wrote:
> > > > >
> > > > > > >  I think I should delete these files immediately after I have
> > > > > > > finished bulk loading data into HBase since they are useless at
> > > > > > > that time, right ?
> > > > > >
> > > > > > Ya, I think so. They are useless once the bulk load task has finished.
> > > > > >
> > > > > > Jieshan.
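
Since, per Jieshan, the current code does not remove these partition files, one
option is to clean them out of the job's working directory after each bulk load.
Below is a minimal cleanup sketch using the plain HDFS FileSystem API, assuming
the files land under /user/root as in this thread (the directory and the
partitions_* glob are assumptions, not something the thread itself verifies):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class PartitionFileCleanup {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Assumed location: the submitting user's working directory seen in this thread.
        Path workingDir = new Path("/user/root");
        FileStatus[] leftovers = fs.globStatus(new Path(workingDir, "partitions_*"));
        if (leftovers != null) {
          for (FileStatus stat : leftovers) {
            fs.delete(stat.getPath(), false);  // non-recursive: each match is a single file
          }
        }
      }
    }

The partitions file is only read by TotalOrderPartitioner while the HFile-producing
MapReduce job is running, so deleting the leftovers after the bulk load has
completed should be safe.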
> > > > > > -----Original Message-----
> > > > > > From: Tao Xiao [mailto:xiaotao.cs.nju@gmail.com]
> > > > > > Sent: Tuesday, December 17, 2013 9:34 AM
> > > > > > To: user@hbase.apache.org
> > > > > > Subject: Re: Why so many unexpected files like partitions_xxxx are
> > > > > > created?
> > > > > >
> > > > > > Indeed these files are produced by
> > > > > > org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles in the directory
> > > > > > returned by job.getWorkingDirectory(), and I think I should delete these
> > > > > > files immediately after I have finished bulk loading data into HBase
> > > > > > since they are useless at that time, right?
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > 2013/12/16 Bijieshan <bijieshan@huawei.com>
> > > > > >
> > > > > > > The reduce partition information is stored in this partitions_XXXX
> > > > > > > file. See the code below:
> > > > > > >
> > > > > > > HFileOutputFormat#configureIncrementalLoad:
> > > > > > >         .....................
> > > > > > >     Path partitionsPath = new Path(job.getWorkingDirectory(),
> > > > > > >                                    "partitions_" + UUID.randomUUID());
> > > > > > >     LOG.info("Writing partition information to " + partitionsPath);
> > > > > > >
> > > > > > >     FileSystem fs = partitionsPath.getFileSystem(conf);
> > > > > > >     writePartitions(conf, partitionsPath, startKeys);
> > > > > > >         .....................
> > > > > > >
> > > > > > > Hoping it helps.
> > > > > > >
> > > > > > > Jieshan
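
For context, the excerpt above runs when the driver of the HFile-producing job
calls HFileOutputFormat.configureIncrementalLoad, which is why every bulk-load
run leaves one more partitions_<UUID> file in the submitting user's working
directory (here /user/root). A rough sketch of such a driver follows, assuming
the 0.94-era API and reusing placeholder names from this thread (table "MyTable",
output directory "HFiles-Dir"); the mapper wiring is deliberately elided:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class HFileJobDriver {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "hfile-generation");
        job.setJarByClass(HFileJobDriver.class);
        // job.setMapperClass(...): plug in the map() shown later in this thread.
        job.setMapOutputKeyClass(ImmutableBytesWritable.class);
        job.setMapOutputValueClass(KeyValue.class);
        FileOutputFormat.setOutputPath(job, new Path("HFiles-Dir"));

        HTable table = new HTable(conf, "MyTable");
        // This is the call that writes partitions_<UUID> into job.getWorkingDirectory()
        // and configures TotalOrderPartitioner and the sort reducer for the job.
        HFileOutputFormat.configureIncrementalLoad(job, table);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }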
> > > > > > > -----Original Message-----
> > > > > > > From: Tao Xiao [mailto:xiaotao.cs.nju@gmail.com]
> > > > > > > Sent: Monday, December 16, 2013 6:48 PM
> > > > > > > To: user@hbase.apache.org
> > > > > > > Subject: Why so many unexpected files like partitions_xxxx are
> > > > > > > created?
> > > > > > >
> > > > > > > I imported data into HBase in the fashion of bulk load, but after
> > > > > > > that I found many unexpected files were created in the HDFS
> > > > > > > directory of /user/root/, and they look like these:
> > > > > > >
> > > > > > > /user/root/partitions_fd74866b-6588-468d-8463-474e202db070
> > > > > > > /user/root/partitions_fd867cd2-d9c9-48f5-9eec-185b2e57788d
> > > > > > > /user/root/partitions_fda37b8a-a882-4787-babc-8310a969f85c
> > > > > > > /user/root/partitions_fdaca2f4-2792-41f6-b7e8-61a8a5677dea
> > > > > > > /user/root/partitions_fdd55baa-3a12-493e-8844-a23ae83209c5
> > > > > > > /user/root/partitions_fdd85a3c-9abe-45d4-a0c6-76d2bed88ea5
> > > > > > > /user/root/partitions_fe133460-5f3f-4c6a-9fff-ff6c62410cc1
> > > > > > > /user/root/partitions_fe29a2b0-b281-465f-8d4a-6044822d960a
> > > > > > > /user/root/partitions_fe2fa6fa-9066-484c-bc91-ec412e48d008
> > > > > > > /user/root/partitions_fe31667b-2d5a-452e-baf7-a81982fe954a
> > > > > > > /user/root/partitions_fe3a5542-bc4d-4137-9d5e-1a0c59f72ac3
> > > > > > > /user/root/partitions_fe6a9407-c27b-4a67-bb50-e6b9fd172bc9
> > > > > > > /user/root/partitions_fe6f9294-f970-473c-8659-c08292c27ddd
> > > > > > > ... ...
> > > > > > > ... ...
> > > > > > >
> > > > > > >
> > > > > > > It seems that they are HFiles, but I don't know why they were
> > > > > > > created here.
> > > > > > >
> > > > > > > I bulk load data into HBase in the following way:
> > > > > > >
> > > > > > > Firstly, I wrote a MapReduce program which only has map tasks. The
> > > > > > > map tasks read some text data and emit them in the form of RowKey
> > > > > > > and KeyValue. The following is my program:
> > > > > > >
> > > > > > >         @Override
> > > > > > >         protected void map(NullWritable NULL, GtpcV1SignalWritable signal,
> > > > > > >                 Context ctx) throws InterruptedException, IOException {
> > > > > > >             String strRowkey = xxx;
> > > > > > >             byte[] rowkeyBytes = Bytes.toBytes(strRowkey);
> > > > > > >
> > > > > > >             rowkey.set(rowkeyBytes);
> > > > > > >
> > > > > > >             part1.init(signal);
> > > > > > >             part2.init(signal);
> > > > > > >
> > > > > > >             KeyValue kv = new KeyValue(rowkeyBytes, Family_A,
> > > > > > >                     Qualifier_Q, part1.serialize());
> > > > > > >             ctx.write(rowkey, kv);
> > > > > > >
> > > > > > >             kv = new KeyValue(rowkeyBytes, Family_B,
> > > > > > >                     Qualifier_Q, part2.serialize());
> > > > > > >             ctx.write(rowkey, kv);
> > > > > > >         }
> > > > > > >
> > > > > > >
> > > > > > > After the MR program finished, there were several HFiles
> > > > > > > generated in the output directory I specified.
> > > > > > >
> > > > > > > Then I began to load these HFiles into HBase using the following
> > > > > > > command:
> > > > > > >        hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles HFiles-Dir MyTable
> > > > > > >
> > > > > > > Finally, I could see that the data were indeed loaded into
> > > > > > > the table in HBase.
> > > > > > >
> > > > > > >
> > > > > > > But I could also see that there were many unexpected files
> > > > > > > generated in the HDFS directory of /user/root/, just as I have
> > > > > > > mentioned at the beginning of this mail, and I did not specify
> > > > > > > any files to be produced in this directory.
> > > > > > >
> > > > > > > What happened? Who can tell me what these files are and who
> > > > > > > produced them?
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
