Subject: Re: Java Null Pointer Exception!
From: Shahab Yunus <shahab.yunus@gmail.com>
To: user@hbase.apache.org
Date: Mon, 19 Aug 2013 10:11:26 -0400

I think you should not try to join the tables this way. It goes against the recommended design patterns of both HBase (joins in HBase alone go against its design) and M/R. You should first pre-process the data, for example through another M/R job or a Pig script, and massage it into a uniform or appropriate structure that conforms to the M/R architecture (maybe convert the tables into flat text files first?).

Have you looked into the recommended M/R join strategies? Some links to start with:

http://codingjunkie.net/mapreduce-reduce-joins/
http://chamibuddhika.wordpress.com/2012/02/26/joins-with-map-reduce/
http://blog.matthewrathbone.com/2013/02/09/real-world-hadoop-implementing-a-left-outer-join-in-hadoop-map-reduce.html

Regards,
Shahab

On Mon, Aug 19, 2013 at 9:43 AM, Pavan Sudheendra wrote:

> I'm basically trying to do a join across 3 tables in the mapper. In the
> reducer I am doing a group-by and writing the output to another table.
>
> Although I agree that my code is pathetic, what I could actually do is
> create an HTable object once and pass it as an extra argument to the map
> function. But would that solve the problem?
>
> Roughly, these are my tables, and the code flows like this:
> Mapper -> Table1 -> Contentidx -> Content -> Mapper aggregates the values ->
> Reducer.
>
> Table1 - 19 million rows.
> Contentidx table - 150k rows.
> Content table - 93k rows.
>
> Yes, I have looked at the map-reduce example given on the HBase website,
> and that is what I am following.
>
> On Mon, Aug 19, 2013 at 7:05 PM, Shahab Yunus wrote:
>
> > Can you please explain or show the flow of the code a bit more? Why are
> > you creating the HTable object again and again in the mapper? Where is
> > ContentidxTable (the name of the table, I believe?) defined? What is
> > your actual requirement?
> >
> > Also, have you looked into this, the API for wiring HBase tables into
> > M/R jobs?
> > http://hbase.apache.org/book/mapreduce.example.html
> >
> > Regards,
> > Shahab
> >
> > On Mon, Aug 19, 2013 at 9:05 AM, Pavan Sudheendra wrote:
> >
> > > Also, the same code works perfectly fine when I run it on a
> > > single-node cluster. I've added the HBase classpath to
> > > HADOOP_CLASSPATH and have set all the other env variables as well.
> > >
> > > On Mon, Aug 19, 2013 at 6:33 PM, Pavan Sudheendra wrote:
> > >
> > > > Hi all,
> > > > I'm getting the following error messages every time I run the
> > > > map-reduce job across multiple Hadoop clusters:
> > > >
> > > > java.lang.NullPointerException
> > > >     at org.apache.hadoop.hbase.util.Bytes.toBytes(Bytes.java:414)
> > > >     at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:170)
> > > >     at com.company$AnalyzeMapper.contentidxjoin(MRjobt.java:153)
> > > >
> > > > Here's the code:
> > > >
> > > > public void map(ImmutableBytesWritable row, Result columns,
> > > >         Context context) throws IOException {
> > > >     ...
> > > >     ...
> > > > public static String contentidxjoin(String contentId) {
> > > >     Configuration conf = HBaseConfiguration.create();
> > > >     HTable table;
> > > >     try {
> > > >         table = new HTable(conf, ContentidxTable);
> > > >         if (table != null) {
> > > >             Get get1 = new Get(Bytes.toBytes(contentId));
> > > >             get1.addColumn(Bytes.toBytes(ContentidxTable_ColumnFamily),
> > > >                     Bytes.toBytes(ContentidxTable_ColumnQualifier));
> > > >             Result result1 = table.get(get1);
> > > >             byte[] val1 =
> > > >                     result1.getValue(Bytes.toBytes(ContentidxTable_ColumnFamily),
> > > >                             Bytes.toBytes(ContentidxTable_ColumnQualifier));
> > > >             if (val1 != null) {
> > > >                 LOGGER.info("Fetched data from BARB-Content table");
> > > >             } else {
> > > >                 LOGGER.error("Error fetching data from BARB-Content table");
> > > >             }
> > > >             return_value = contentjoin(Bytes.toString(val1), contentId);
> > > >         }
> > > >     } catch (Exception e) {
> > > >         LOGGER.error("Error inside contentidxjoin method");
> > > >         e.printStackTrace();
> > > >     }
> > > >     return return_value;
> > > > }
> > > >
> > > > Assume all variables are defined.
> > > >
> > > > Can anyone please tell me why the table never gets instantiated or
> > > > entered? I had set up breakpoints, and this function gets called
> > > > many times while the mapper executes. Every time it says *Error
> > > > inside contentidxjoin method*. I'm 100% sure there are rows in the
> > > > ContentidxTable, so I'm not sure why it's not able to fetch the
> > > > value from it.
> > > >
> > > > Please help!
> > > >
> > > > --
> > > > Regards-
> > > > Pavan
> > >
> > > --
> > > Regards-
> > > Pavan
>
> --
> Regards-
> Pavan
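[Editor's note: to illustrate the advice above — load the small tables once instead of opening an HTable per map() call — here is a minimal, self-contained sketch of a map-side (replicated) join. Since Contentidx (~150k rows) and Content (~93k rows) are tiny compared to Table1 (19 million rows), both small tables can be held in memory per mapper, analogous to doing expensive setup in the Mapper's setup() method. All class, field, and method names below (MapSideJoinSketch, joinRow, etc.) are illustrative, not from the original code, and plain HashMaps stand in for the HBase lookups.]

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a map-side (replicated) join: the two small lookup tables
// are loaded into in-memory maps ONCE (as one would do in setup()),
// then each big-table row is joined with cheap get() calls instead of
// constructing a new HTable on every map() invocation.
public class MapSideJoinSketch {

    // Stand-ins for the Contentidx and Content tables, loaded once.
    private final Map<String, String> contentidx = new HashMap<>();
    private final Map<String, String> content = new HashMap<>();

    public MapSideJoinSketch(Map<String, String> contentidx,
                             Map<String, String> content) {
        this.contentidx.putAll(contentidx);
        this.content.putAll(content);
    }

    // Analogous to the work done per map() call: join one row of the
    // big table against both small tables, returning null on no match.
    public String joinRow(String contentId) {
        String idxVal = contentidx.get(contentId);  // Contentidx lookup
        if (idxVal == null) {
            return null;                            // no index match
        }
        String contentVal = content.get(idxVal);    // Content lookup
        if (contentVal == null) {
            return null;                            // no content match
        }
        return contentId + "\t" + contentVal;       // joined output row
    }

    public static void main(String[] args) {
        Map<String, String> idx = new HashMap<>();
        idx.put("c1", "k1");
        Map<String, String> cnt = new HashMap<>();
        cnt.put("k1", "Title One");

        MapSideJoinSketch join = new MapSideJoinSketch(idx, cnt);
        System.out.println(join.joinRow("c1")); // c1<TAB>Title One
        System.out.println(join.joinRow("c2")); // null (no match)
    }
}
```

In a real job the maps would be populated in the Mapper's setup() (or distributed via the distributed cache), so each mapper pays the load cost once rather than per record.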