Subject: Re: Bulk load fails with NullPointerException
From: Amit Sela <amits@infolinks.com>
To: user@hbase.apache.org
Date: Thu, 15 Nov 2012 18:09:09 +0200

After some digging into the code, it looks like this bug also affects bulk load when using LoadIncrementalHFiles (i.e. bulk loading programmatically).
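To be clear, by "programmatically" I mean driving the load from Java code rather than the completebulkload tool, along these lines (a minimal sketch only; the table name and HFile output path are made up, and the exact LoadIncrementalHFiles constructor signature depends on your HBase version):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

    public class BulkLoadDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            // HFiles previously written by the MR job through HFileOutputFormat.
            Path hfileDir = new Path("/user/hadoop/bulkload-output"); // hypothetical path
            HTable table = new HTable(conf, "urls");                  // hypothetical table name
            try {
                // doBulkLoad() -> groupOrSplitPhase() is where the NPE from the
                // quoted stack trace below is thrown when the HFiles are GZ-compressed.
                new LoadIncrementalHFiles(conf).doBulkLoad(hfileDir, table);
            } finally {
                table.close();
            }
        }
    }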
We fixed the code in Compression.class (in Algorithm):

    GZ("gz") {
        private transient GzipCodec codec;

        @Override
        DefaultCodec getCodec(Configuration conf) {
            // Lazily create the codec, always passing it a copy of the job Configuration.
            if (codec == null) {
                synchronized (this) {
                    if (codec == null) {
                        codec = new ReusableStreamGzipCodec(new Configuration(conf));
                    }
                }
            }
            return codec;
        }
    }

That way the codec is always created with a Configuration, so the GzipCodec never hands ZlibFactory a null conf. In addition, since we pre-create regions before bulk loading, we wanted the MR job to deal only with those regions: by inheriting HFileOutputFormat you can set only the split points that are relevant to the job, which saves a lot of reduce time (especially if you have hundreds or thousands of regions). This works for us because each bulk load we do targets a specific timestamp. (A rough sketch of the split-points idea is at the bottom of this mail, below the quoted thread.)

Hope it helps anyone...

Thanks.

On Wed, Nov 7, 2012 at 9:44 AM, Amit Sela wrote:

> Does this bug affect Snappy as well? Maybe I'll just use it instead of GZ
> (it is also recommended in the book).
>
> On Tue, Nov 6, 2012 at 10:27 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:
>
>> I'm not talking about the major compaction, but about the CF compression.
>>
>> What's your table definition? Do you have the compression (GZ) defined there?
>>
>> It seems there is some failure with this, based on the stack trace.
>>
>> So if you disable it while you are doing your load, you should not face this again.
>> Then you can alter your CF to re-activate it.
>>
>> 2012/11/6, Amit Sela:
>>> Do you mean setting hbase.hregion.majorcompaction to 0?
>>> Because it's already set this way. We pre-create new regions before writing
>>> to HBase and initiate a major compaction once a day.
>>>
>>> On Tue, Nov 6, 2012 at 8:51 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:
>>>
>>>> Maybe one option would be to disable the compaction, load the data,
>>>> re-activate the compaction, then major-compact the data?
>>>>
>>>> 2012/11/6, Amit Sela:
>>>>> Seems like that's the one alright... Any ideas how to avoid it? Maybe a patch?
>>>>>
>>>>> On Tue, Nov 6, 2012 at 8:05 PM, Jean-Daniel Cryans wrote:
>>>>>
>>>>>> This sounds a lot like https://issues.apache.org/jira/browse/HBASE-5458
>>>>>>
>>>>>> On Tue, Nov 6, 2012 at 2:28 AM, Amit Sela wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I'm trying to bulk load using LoadIncrementalHFiles and I get a
>>>>>>> NullPointerException at:
>>>>>>> org.apache.hadoop.io.compress.zlib.ZlibFactory.isNativeZlibLoaded(ZlibFactory.java:63).
>>>>>>>
>>>>>>> It looks like DefaultCodec has no set configuration...
>>>>>>>
>>>>>>> Anyone encounter this before?
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> Full exception thrown:
>>>>>>>
>>>>>>> java.util.concurrent.ExecutionException: java.lang.NullPointerException
>>>>>>>     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>>>>>>>     at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>>>>>>>     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplitPhase(LoadIncrementalHFiles.java:333)
>>>>>>>     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:232)
>>>>>>>     at com.infolinks.hadoop.jobrunner.UrlsHadoopJobExecutor.executeURLJob(UrlsHadoopJobExecutor.java:204)
>>>>>>>     at com.infolinks.hadoop.jobrunner.UrlsHadoopJobExecutor.runJobIgnoreSystemJournal(UrlsHadoopJobExecutor.java:86)
>>>>>>>     at com.infolinks.hadoop.jobrunner.HadoopJobExecutor.main(HadoopJobExecutor.java:182)
>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>>>>>> Caused by: java.lang.NullPointerException
>>>>>>>     at org.apache.hadoop.io.compress.zlib.ZlibFactory.isNativeZlibLoaded(ZlibFactory.java:63)
>>>>>>>     at org.apache.hadoop.io.compress.GzipCodec.getDecompressorType(GzipCodec.java:142)
>>>>>>>     at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:125)
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.getDecompressor(Compression.java:290)
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.decompress(HFileBlock.java:1391)
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1897)
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1637)
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader$1.nextBlock(HFileBlock.java:1286)
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader$1.nextBlockWithBlockType(HFileBlock.java:1294)
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.<init>(HFileReaderV2.java:126)
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:552)
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.HFile.createReaderWithEncoding(HFile.java:589)
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:603)
>>>>>>>     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:402)
>>>>>>>     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:323)
>>>>>>>     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:321)
>>>>>>>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>>>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>>>>     at java.lang.Thread.run(Thread.java:662)
>>>>>>>
>>>>>>> 12/11/06 10:21:50 ERROR jobrunner.UrlsHadoopJobExecutor: jobCompleteStatus: false
>>>>>>> java.lang.RuntimeException: java.lang.IllegalStateException: java.lang.NullPointerException
>>>>>>>     at com.infolinks.hadoop.jobrunner.UrlsHadoopJobExecutor.executeURLJob(UrlsHadoopJobExecutor.java:210)
>>>>>>>     at com.infolinks.hadoop.jobrunner.UrlsHadoopJobExecutor.runJobIgnoreSystemJournal(UrlsHadoopJobExecutor.java:86)
>>>>>>>     at com.infolinks.hadoop.jobrunner.HadoopJobExecutor.main(HadoopJobExecutor.java:182)
>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>>>>>> Caused by: java.lang.IllegalStateException: java.lang.NullPointerException
>>>>>>>     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplitPhase(LoadIncrementalHFiles.java:344)
>>>>>>>     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:232)
>>>>>>>     at com.infolinks.hadoop.jobrunner.UrlsHadoopJobExecutor.executeURLJob(UrlsHadoopJobExecutor.java:204)
>>>>>>>     ... 7 more
>>>>>>> Caused by: java.lang.NullPointerException
>>>>>>>     at org.apache.hadoop.io.compress.zlib.ZlibFactory.isNativeZlibLoaded(ZlibFactory.java:63)
>>>>>>>     at org.apache.hadoop.io.compress.GzipCodec.getDecompressorType(GzipCodec.java:142)
>>>>>>>     at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:125)
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.getDecompressor(Compression.java:290)
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.decompress(HFileBlock.java:1391)
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1897)
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1637)
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader$1.nextBlock(HFileBlock.java:1286)
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader$1.nextBlockWithBlockType(HFileBlock.java:1294)
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.<init>(HFileReaderV2.java:126)
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:552)
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.HFile.createReaderWithEncoding(HFile.java:589)
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:603)
>>>>>>>     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:402)
>>>>>>>     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:323)
>>>>>>>     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:321)
>>>>>>>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>>>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>>>>     at java.lang.Thread.run(Thread.java:662)
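For anyone curious about the split-points part mentioned above: below is a rough sketch of the idea, not our actual code (we do it by subclassing HFileOutputFormat). The class and method names are made up for illustration, and the TotalOrderPartitioner import shown is the HBase hadoopbackport one that HFileOutputFormat uses in 0.92/0.94; adjust it to whatever your Hadoop/HBase version provides. It does by hand what HFileOutputFormat.configureIncrementalLoad does with the table's full set of region start keys, but with only the split points the current load actually touches:

    import java.io.IOException;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.mapreduce.Job;

    public class RestrictedSplitPoints {

        /**
         * Wires the job's reduce-side partitioning to a caller-supplied list of
         * split points instead of every region boundary in the table.
         * splitPoints must be sorted and must not include the empty start key
         * of the table's first region.
         */
        public static void configure(Job job, List<byte[]> splitPoints) throws IOException {
            Configuration conf = job.getConfiguration();
            Path partitionsPath = new Path("/tmp/partitions_" + System.currentTimeMillis());
            FileSystem fs = partitionsPath.getFileSystem(conf);

            // Write the split points as a SequenceFile, the format TotalOrderPartitioner reads.
            SequenceFile.Writer writer = SequenceFile.createWriter(
                    fs, conf, partitionsPath, ImmutableBytesWritable.class, NullWritable.class);
            try {
                for (byte[] splitPoint : splitPoints) {
                    writer.append(new ImmutableBytesWritable(splitPoint), NullWritable.get());
                }
            } finally {
                writer.close();
            }

            // N split points define N + 1 key ranges, i.e. N + 1 reducers / HFile sets.
            job.setNumReduceTasks(splitPoints.size() + 1);
            job.setPartitionerClass(TotalOrderPartitioner.class);
            TotalOrderPartitioner.setPartitionFile(conf, partitionsPath);
        }
    }

With a handful of relevant split points instead of hundreds or thousands of region boundaries, the job runs far fewer reducers, which is where the reduce-time saving comes from. The map output key still has to be ImmutableBytesWritable, exactly as in a normal HFileOutputFormat job.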