Subject: Re: Setting up stats database
From: wd
To: hive-user@hadoop.apache.org
Date: Mon, 15 Aug 2011 16:06:09 +0800

HBase Publisher/Aggregator classes cannot be loaded.
It seems I would need to configure a publisher/aggregator for HBase, so
the only remaining option is MySQL. Will a stats database help optimize
Hive queries? I'm considering whether or not to set up MySQL for this.

On Mon, Aug 15, 2011 at 3:17 PM, wd wrote:
> Oh, I found that Hive only supports MySQL and HBase. I'll try HBase.
>
> On Mon, Aug 15, 2011 at 3:09 PM, wd wrote:
>> Hi,
>>
>> I'm trying to use Postgres as the stats database, and I made the
>> following settings in hive-site.xml:
>>
>> <property>
>>   <name>hive.stats.dbclass</name>
>>   <value>jdbc:postgresql</value>
>>   <description>The default database that stores temporary hive
>>   statistics.</description>
>> </property>
>>
>> <property>
>>   <name>hive.stats.autogather</name>
>>   <value>true</value>
>>   <description>A flag to gather statistics automatically during the
>>   INSERT OVERWRITE command.</description>
>> </property>
>>
>> <property>
>>   <name>hive.stats.jdbcdriver</name>
>>   <value>org.postgresql.Driver</value>
>>   <description>The JDBC driver for the database that stores temporary
>>   hive statistics.</description>
>> </property>
>>
>> <property>
>>   <name>hive.stats.dbconnectionstring</name>
>>   <value>jdbc:postgresql://localhost/hive_statsdb?createDatabaseIfNotExist=true;user=hive;password=pwd</value>
>>   <description>The default connection string for the database that
>>   stores temporary hive statistics.</description>
>> </property>
>>
>> I use Postgres as the Hive metastore database, so there is a
>> postgresql-9.0-801.jdbc4.jar file in lib.
>>
>> After running 'analyze table t1 partition(dt) compute statistics;' in
>> the Hive CLI, it prints some stats info in the CLI, but nothing is
>> written to the database, and I found the following errors:
>>
>> 1-08-15 14:54:54,767 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Stats Gathering found a new partition spec = dt=20110805
>> 2011-08-15 14:54:54,767 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarding 1 rows
>> 2011-08-15 14:54:54,767 INFO ExecMapper: ExecMapper: processing 1 rows: used memory = 39953640
>> 2011-08-15 14:54:54,768 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 1 finished. closing...
>> 2011-08-15 14:54:54,768 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 1 forwarded 2 rows
>> 2011-08-15 14:54:54,768 INFO org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0
>> 2011-08-15 14:54:54,768 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 finished. closing...
>> 2011-08-15 14:54:54,768 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarded 2 rows
>> 2011-08-15 14:54:54,772 ERROR org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher: Error during JDBC connection to jdbc:postgresql://localhost/hive_statsdb?createDatabaseIfNotExist=true;user=hive;password=pwd.
>> java.lang.ClassNotFoundException: org.postgresql.Driver
>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>>         at java.lang.Class.forName0(Native Method)
>>         at java.lang.Class.forName(Class.java:169)
>>         at org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher.connect(JDBCStatsPublisher.java:55)
>>         at org.apache.hadoop.hive.ql.exec.TableScanOperator.publishStats(TableScanOperator.java:202)
>>         at org.apache.hadoop.hive.ql.exec.TableScanOperator.closeOp(TableScanOperator.java:164)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:557)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
>>         at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>>         at org.apache.hadoop.mapred.Child.main(Child.java:170)
>> 2011-08-15 14:54:54,774 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: StatsPublishing error: cannot connect to database.
>> 2011-08-15 14:54:54,774 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 1 Close done
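
In the quoted trace, the ClassNotFoundException is thrown from Class.forName inside the map task JVM (org.apache.hadoop.mapred.Child), so the driver jar is possibly visible to the Hive CLI but not to the tasks. A minimal sketch for checking whether a driver class can be loaded from a given classpath follows. DriverCheck and isLoadable are hypothetical names used only for illustration and are not part of Hive; the default class name is taken from hive.stats.jdbcdriver in the quoted config.

```java
// Hypothetical diagnostic helper, not part of Hive or Hadoop.
// It repeats the Class.forName call that fails in the quoted stack trace.
class DriverCheck {

    // Returns true if the named class can be loaded by this JVM,
    // false if the class loader cannot find it.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default driver name comes from the quoted hive-site.xml;
        // pass another class name as the first argument to check it instead.
        String driver = args.length > 0 ? args[0] : "org.postgresql.Driver";
        String verdict = isLoadable(driver)
                ? " is on the classpath"
                : " is NOT on the classpath";
        System.out.println(driver + verdict);
    }
}
```

Running it once with and once without postgresql-9.0-801.jdbc4.jar on the classpath (for example via java -cp) should show the difference. This only tests the classpath you give it, so it helps only if that classpath matches what the map tasks actually receive.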