Date: Wed, 16 Mar 2011 12:59:46 -0400
Subject: Re: Problem with Hive HBase Integration - Running Mapper task
From: Edward Capriolo <edlinuxguru@gmail.com>
To: user@hive.apache.org
Cc: Abhijit Sharma

On Wed, Mar 16, 2011 at 12:51 PM, Abhijit Sharma wrote:
> Hi,
> I am trying to connect the Hive shell running on my laptop to a remote
> Hadoop/HBase cluster and test out the HBase/Hive integration. I manage to
> connect and create the table in HBase from the remote Hive shell. I am also
> passing the auxpath parameter to the shell (specifying the Hive/HBase
> integration related jars). In addition I have copied these files to HDFS as
> well (I am using the user name hadoop, so the jars are stored in HDFS under
> /user/hadoop).
> However when I fire a query on the HBase table - select * from h1 where
> key=12; - the map/reduce job launches but the map task fails with the
> following error:
> ----
> java.io.IOException: Cannot create an instance of InputSplit class =
> org.apache.hadoop.hive.hbase.HBaseSplit:org.apache.hadoop.hive.hbase.HBaseSplit
>   at org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit.readFields(HiveInputFormat.java:143)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:333)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>   at org.apache.hadoop.mapred.Child.main(Child.java:170)
> ----
> This basically indicates that the mapper task is unable to locate the
> Hive/HBase storage handler that it requires when running. This happens even
> though it has been specified in the auxpath and uploaded to HDFS.
> Any ideas/pointers/debug options on what I might be doing wrong? Any help is
> much appreciated.
> p.s. the exploded jars do get copied under the taskTracker directory on
> the cluster node
> Thanks

I have seen this error. It comes from a mismatch between the Hadoop, Hive, and map/reduce classpaths. This is what I do:

mkdir hive_home/auxlib

Copy all the Hive and HBase jars there, and also copy the HBase storage handler jar to auxlib. auxlib gets pushed out by the distributed cache for each job, so you do not need to use ADD JAR XXXX;

But that is not enough! DOH! Planning the job and computing the splits happen before the map tasks are launched. For that I drop all the HBase libs into hadoop_home/lib, but only on the machine that is launching the job. You can also fiddle around with HADOOP_CLASSPATH and achieve similar results.

Good luck.
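
The jar staging described above can be sketched as a few shell commands. This is only a sketch of the idea, not an exact recipe: `HIVE_HOME` and `HBASE_HOME` are assumed to point at local installs, and the jar names/versions are illustrative and depend on your Hive and HBase releases.

```shell
# 1. Stage the storage handler and HBase jars in auxlib, which Hive ships
#    to the cluster via the distributed cache with each job.
mkdir -p "$HIVE_HOME/auxlib"
cp "$HIVE_HOME"/lib/hive-hbase-handler-*.jar "$HIVE_HOME/auxlib/"
cp "$HBASE_HOME"/hbase-*.jar "$HIVE_HOME/auxlib/"
cp "$HBASE_HOME"/lib/zookeeper-*.jar "$HIVE_HOME/auxlib/"

# 2. Job planning (split computation) runs on the submitting machine before
#    any map task starts, so the same classes must also be visible there:
#    either drop the jars into $HADOOP_HOME/lib on that one machine, or
#    extend HADOOP_CLASSPATH before launching the Hive shell.
export HADOOP_CLASSPATH="$HIVE_HOME/auxlib/*:$HADOOP_CLASSPATH"
```

With the jars staged this way they ride along with every job automatically, which sidesteps per-session `ADD JAR` statements on the client.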