hbase-user mailing list archives

From Amit Sela <am...@infolinks.com>
Subject Re: Bulk load fails with NullPointerException
Date Thu, 15 Nov 2012 16:09:09 GMT
After some digging into the code, it looks like this bug also affects bulk
load when using LoadIncrementalHFiles (i.e. bulk loading programmatically).
We fixed the code in the GZ entry of the Compression.Algorithm enum:

GZ("gz") {
            private transient GzipCodec codec;

            @Override
            DefaultCodec getCodec(Configuration conf) {
                if (codec == null) {
                    synchronized (this) {
                        if (codec == null) {
                            codec = new ReusableStreamGzipCodec(new
Configuration(conf));
                        }
                    }
                }
                return codec;
            }
        }

That way the codec is always constructed with a Configuration, so ZlibFactory.isNativeZlibLoaded() is never called with a null conf.
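
For context, a small sketch of the underlying problem (assuming the Hadoop 1.x-era codec classes and a node where the native zlib library is loaded): a GzipCodec that never had a Configuration set hands a null conf to ZlibFactory.isNativeZlibLoaded(), which is exactly the NPE in the trace below.

import org.apache.hadoop.io.compress.GzipCodec;

public class GzipNpeRepro {
    public static void main(String[] args) {
        // Codec created directly, with no Configuration ever set on it.
        GzipCodec codec = new GzipCodec();
        // With native zlib loaded, this reaches ZlibFactory.isNativeZlibLoaded(null)
        // and throws the NullPointerException from the stack trace.
        codec.getDecompressorType();
    }
}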

In addition, since we pre-create regions before bulk loading, we wanted the
MR job to deal only with those regions. By inheriting HFileOutputFormat you
can set only the split points that are relevant to the job, which saves a
lot of reduce time (especially if you have hundreds or thousands of
regions); see the sketch below.
This works for us since each bulk load we do is relevant to a specific
timestamp. Hope it helps anyone...
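
Here is a minimal sketch of that idea. The class and method names are just illustrative, and the partition-file plumbing mirrors what HFileOutputFormat.configureIncrementalLoad() does internally, so details such as the TotalOrderPartitioner package may differ between HBase versions:

import java.io.IOException;
import java.util.List;
import java.util.UUID;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.mapreduce.Job;

public class ScopedHFileOutputFormat extends HFileOutputFormat {

    /**
     * Configure the reduce phase only for the regions this bulk load will
     * actually touch. Assumes relevantStartKeys is sorted and begins with the
     * start key of the first relevant region.
     */
    public static void configureSplitPoints(Job job, List<byte[]> relevantStartKeys)
            throws IOException {
        Configuration conf = job.getConfiguration();
        job.setPartitionerClass(TotalOrderPartitioner.class);
        // One reducer (and therefore one set of HFiles) per target region.
        job.setNumReduceTasks(relevantStartKeys.size());

        // Write the split points (every start key except the first one)
        // into the partitions file read by the TotalOrderPartitioner.
        Path partitionsPath = new Path(job.getWorkingDirectory(),
                "partitions_" + UUID.randomUUID());
        FileSystem fs = partitionsPath.getFileSystem(conf);
        SequenceFile.Writer writer = SequenceFile.createWriter(fs, conf, partitionsPath,
                ImmutableBytesWritable.class, NullWritable.class);
        try {
            for (int i = 1; i < relevantStartKeys.size(); i++) {
                writer.append(new ImmutableBytesWritable(relevantStartKeys.get(i)),
                        NullWritable.get());
            }
        } finally {
            writer.close();
        }
        TotalOrderPartitioner.setPartitionFile(conf, partitionsPath);
    }
}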

Thanks.

On Wed, Nov 7, 2012 at 9:44 AM, Amit Sela <amits@infolinks.com> wrote:

> Does this bug affect Snappy as well? Maybe I'll just use it instead of GZ
> (it's also recommended in the book).
>
>
> On Tue, Nov 6, 2012 at 10:27 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:
>
>> I'm not talking about the major compaction, but about the CF compression.
>>
>> What's your table definition? Do you have the compression (GZ) defined
>> there?
>>
>> It seems there is some failure with this based on the stack trace.
>>
>> So if you disable it while you are doing your load, you should not
>> face this again. Then you can alter your CF to re-activate it?
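
A rough sketch of that disable/re-enable approach, assuming the 0.94-era admin API (table and family names below are placeholders, and the schema change requires disabling the table):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.io.hfile.Compression;
import org.apache.hadoop.hbase.util.Bytes;

public class ToggleGzForBulkLoad {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        HTableDescriptor desc = admin.getTableDescriptor(Bytes.toBytes("mytable"));
        HColumnDescriptor family = desc.getFamily(Bytes.toBytes("cf"));

        // Turn GZ off on the column family before the load.
        admin.disableTable("mytable");
        family.setCompressionType(Compression.Algorithm.NONE);
        admin.modifyColumn("mytable", family);
        admin.enableTable("mytable");

        // ... run the bulk load (LoadIncrementalHFiles) here ...

        // Restore GZ and major-compact so the files get rewritten compressed.
        admin.disableTable("mytable");
        family.setCompressionType(Compression.Algorithm.GZ);
        admin.modifyColumn("mytable", family);
        admin.enableTable("mytable");
        admin.majorCompact("mytable");
    }
}
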
>>
>> 2012/11/6, Amit Sela <amits@infolinks.com>:
>> > Do you mean setting hbase.hregion.majorcompaction to 0?
>> > Because it's already set this way. We pre-create new regions before
>> > writing to HBase and initiate a major compaction once a day.
>> >
>> > On Tue, Nov 6, 2012 at 8:51 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:
>> >
>> >> Maybe one option would be to disable the compaction, load the data,
>> >> re-activate the compaction, and major-compact the data?
>> >>
>> >> 2012/11/6, Amit Sela <amits@infolinks.com>:
>> >> > Seems like that's the one alright... Any ideas how to avoid it? Maybe a patch?
>> >> >
>> >> > On Tue, Nov 6, 2012 at 8:05 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>> >> >
>> >> >> This sounds a lot like
>> >> >> https://issues.apache.org/jira/browse/HBASE-5458
>> >> >>
>> >> >> On Tue, Nov 6, 2012 at 2:28 AM, Amit Sela <amits@infolinks.com> wrote:
>> >> >> > Hi all,
>> >> >> >
>> >> >> > I'm trying to bulk load using LoadIncrementalHFiles and I get a
>> >> >> > NullPointerException at
>> >> >> > org.apache.hadoop.io.compress.zlib.ZlibFactory.isNativeZlibLoaded(ZlibFactory.java:63).
>> >> >> >
>> >> >> > It looks like DefaultCodec has no set configuration...
>> >> >> >
>> >> >> > Has anyone encountered this before?
>> >> >> >
>> >> >> > Thanks.
>> >> >> >
>> >> >> >>>>>>>>Full exception thrown:
>> >> >> >
>> >> >> > java.util.concurrent.ExecutionException: java.lang.NullPointerException
>> >> >> > at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>> >> >> > at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>> >> >> > at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplitPhase(LoadIncrementalHFiles.java:333)
>> >> >> > at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:232)
>> >> >> > at com.infolinks.hadoop.jobrunner.UrlsHadoopJobExecutor.executeURLJob(UrlsHadoopJobExecutor.java:204)
>> >> >> > at com.infolinks.hadoop.jobrunner.UrlsHadoopJobExecutor.runJobIgnoreSystemJournal(UrlsHadoopJobExecutor.java:86)
>> >> >> > at com.infolinks.hadoop.jobrunner.HadoopJobExecutor.main(HadoopJobExecutor.java:182)
>> >> >> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >> >> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> >> >> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >> >> > at java.lang.reflect.Method.invoke(Method.java:597)
>> >> >> > at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>> >> >> > Caused by: java.lang.NullPointerException
>> >> >> > at org.apache.hadoop.io.compress.zlib.ZlibFactory.isNativeZlibLoaded(ZlibFactory.java:63)
>> >> >> > at org.apache.hadoop.io.compress.GzipCodec.getDecompressorType(GzipCodec.java:142)
>> >> >> > at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:125)
>> >> >> > at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.getDecompressor(Compression.java:290)
>> >> >> > at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.decompress(HFileBlock.java:1391)
>> >> >> > at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1897)
>> >> >> > at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1637)
>> >> >> > at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader$1.nextBlock(HFileBlock.java:1286)
>> >> >> > at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader$1.nextBlockWithBlockType(HFileBlock.java:1294)
>> >> >> > at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.<init>(HFileReaderV2.java:126)
>> >> >> > at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:552)
>> >> >> > at org.apache.hadoop.hbase.io.hfile.HFile.createReaderWithEncoding(HFile.java:589)
>> >> >> > at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:603)
>> >> >> > at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:402)
>> >> >> > at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:323)
>> >> >> > at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:321)
>> >> >> > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>> >> >> > at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>> >> >> > at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> >> >> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> >> >> > at java.lang.Thread.run(Thread.java:662)
>> >> >> > 12/11/06 10:21:50 ERROR jobrunner.UrlsHadoopJobExecutor: jobCompleteStatus: false
>> >> >> > java.lang.RuntimeException: java.lang.IllegalStateException: java.lang.NullPointerException
>> >> >> > at com.infolinks.hadoop.jobrunner.UrlsHadoopJobExecutor.executeURLJob(UrlsHadoopJobExecutor.java:210)
>> >> >> > at com.infolinks.hadoop.jobrunner.UrlsHadoopJobExecutor.runJobIgnoreSystemJournal(UrlsHadoopJobExecutor.java:86)
>> >> >> > at com.infolinks.hadoop.jobrunner.HadoopJobExecutor.main(HadoopJobExecutor.java:182)
>> >> >> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >> >> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> >> >> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >> >> > at java.lang.reflect.Method.invoke(Method.java:597)
>> >> >> > at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>> >> >> > Caused by: java.lang.IllegalStateException: java.lang.NullPointerException
>> >> >> > at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplitPhase(LoadIncrementalHFiles.java:344)
>> >> >> > at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:232)
>> >> >> > at com.infolinks.hadoop.jobrunner.UrlsHadoopJobExecutor.executeURLJob(UrlsHadoopJobExecutor.java:204)
>> >> >> > ... 7 more
>> >> >> > Caused by: java.lang.NullPointerException
>> >> >> > at org.apache.hadoop.io.compress.zlib.ZlibFactory.isNativeZlibLoaded(ZlibFactory.java:63)
>> >> >> > at org.apache.hadoop.io.compress.GzipCodec.getDecompressorType(GzipCodec.java:142)
>> >> >> > at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:125)
>> >> >> > at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.getDecompressor(Compression.java:290)
>> >> >> > at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.decompress(HFileBlock.java:1391)
>> >> >> > at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1897)
>> >> >> > at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1637)
>> >> >> > at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader$1.nextBlock(HFileBlock.java:1286)
>> >> >> > at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader$1.nextBlockWithBlockType(HFileBlock.java:1294)
>> >> >> > at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.<init>(HFileReaderV2.java:126)
>> >> >> > at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:552)
>> >> >> > at org.apache.hadoop.hbase.io.hfile.HFile.createReaderWithEncoding(HFile.java:589)
>> >> >> > at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:603)
>> >> >> > at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:402)
>> >> >> > at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:323)
>> >> >> > at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:321)
>> >> >> > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>> >> >> > at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>> >> >> > at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> >> >> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> >> >> > at java.lang.Thread.run(Thread.java:662)
>> >> >>
>> >> >
>> >>
>> >
>>
>
>
