Return-Path: X-Original-To: apmail-hive-user-archive@www.apache.org Delivered-To: apmail-hive-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 6C8ED11DFE for ; Tue, 22 Jul 2014 17:35:46 +0000 (UTC) Received: (qmail 53886 invoked by uid 500); 22 Jul 2014 17:35:45 -0000 Delivered-To: apmail-hive-user-archive@hive.apache.org Received: (qmail 53813 invoked by uid 500); 22 Jul 2014 17:35:45 -0000 Mailing-List: contact user-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hive.apache.org Delivered-To: mailing list user@hive.apache.org Received: (qmail 53788 invoked by uid 99); 22 Jul 2014 17:35:45 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 22 Jul 2014 17:35:45 +0000 X-ASF-Spam-Status: No, hits=1.7 required=5.0 tests=FREEMAIL_ENVFROM_END_DIGIT,HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of nitinpawar432@gmail.com designates 209.85.160.174 as permitted sender) Received: from [209.85.160.174] (HELO mail-yk0-f174.google.com) (209.85.160.174) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 22 Jul 2014 17:35:41 +0000 Received: by mail-yk0-f174.google.com with SMTP id q9so5099791ykb.5 for ; Tue, 22 Jul 2014 10:35:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=VgH6kYZNR92I8RBdHeVP7SB5Q+h4vuYfT+3kNpsb8tw=; b=IecB79qYL7foEuAOsI7bKH7CwrgNHBAtuSY3n7wWd40ZdzjvLZM/XFNhqqEIu9un5T jQP9xEeSbrQbsRfF4J0LmgzOScpJij8DwnVpAzylR9su4XVW7rOQJf1Mpn+w02+Ix7fo c2x1DNc8A6Id4OG77kfcg7TTKkuuiWUTSoKOQzbQ53+qvqGWEUtN5o3W50ZzSHTw0GtH UBsi9LFn2woGcAuZlxiyyCyRlzXzCiorCGTrYxxEl9Aj1ID7u8AcLSHuXpRmfAyjpkJw 5155PRkX+VH6CUlUmr5jDqf9XeKBt0xUs6WX6lnyXf5jKD5cAsWITQ4HR149BMv2Ap+K PelA== MIME-Version: 1.0 X-Received: by 10.236.142.242 with SMTP id i78mr53972166yhj.17.1406050520400; Tue, 22 Jul 2014 10:35:20 -0700 (PDT) Received: by 10.170.110.146 with HTTP; Tue, 22 Jul 2014 10:35:20 -0700 (PDT) In-Reply-To: References: Date: Tue, 22 Jul 2014 23:05:20 +0530 Message-ID: Subject: Re: Hive Statistics From: Nitin Pawar To: "user@hive.apache.org" Content-Type: multipart/alternative; boundary=20cf300e526bce0ffa04fecba51a X-Virus-Checked: Checked by ClamAV on apache.org --20cf300e526bce0ffa04fecba51a Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable by default hive stores the statistics in derby database. If you want a persistent look at column statistics, you may want to create mysql based database for column statistics. Your queries look fine On Tue, Jul 22, 2014 at 10:50 PM, Navdeep Agrawal < Navdeep_Agrawal@symantec.com> wrote: > Hi , > > > > i am trying to compute statistics on ORC File but i am unable see any > changes in PART_COL_STATS as well on using > > set hive.compute.query.using.stats=3Dtrue; > set hive.stats.reliable=3Dtrue; > set hive.stats.fetch.column.stats=3Dtrue; > set hive.stats.fetch.partition.stats=3Dtrue; > set hive.cbo.enable=3Dtrue; > > to get max value of a column it is running full Map reduce on column .. > what i want to use is max value stored in meta store ,but i am unable to > catch these statistics . > > my table desc is > load_inst_id int > src_filename string > server_date date > > my analyze query is > analyze table mytable partition(server_date=3D=E2=80=992013-11-30=E2=80= =B2) compute > statistics for columns load_inst_id; > > i am always getting 0 as loadinstant id ,i have to turn off my > hive.compute.query.using.stats to get correct result(through map reduce > max(load_inst_id)) > > > --=20 Nitin Pawar --20cf300e526bce0ffa04fecba51a Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
by default hive stores the statistics in derby database.= =C2=A0

If you want a persistent look at column statistic= s, you may want to create mysql based database for column statistics.=C2=A0=

Your queries look fine=C2=A0


On Tue, Jul 22, 2014 at 10:50 PM,= Navdeep Agrawal <Navdeep_Agrawal@symantec.com> w= rote:

Hi ,

=C2=A0

i am trying to compute statistics on ORC File but i am unable see any= changes in PART_COL_STATS as well on using

set hive.compute.query.using.stats=3Dtrue;
set hive.stats.reliable=3Dtrue;
set hive.stats.fetch.column.stats=3Dtrue= ;
set hive.stats.fetch.partition.stats=3Dtrue;
set hive.cbo.enable=3D= true;

to get max value of a column it is running full= Map reduce on column ..
what i want to use is max value stored in meta = store ,but i am unable to catch these statistics .

my table desc is
load_inst_id int
src_filename string
server_date date

my analyze query is
analyze table mytable pa= rtition(server_date=3D=E2=80=992013-11-30=E2=80=B2) compute statistics for = columns load_inst_id;

i am always getting 0 as loadinstant id ,i have to = turn off my hive.compute.query.using.stats to get correct result(through ma= p reduce max(load_inst_id))

=C2=A0




--
Nitin Pawar
--20cf300e526bce0ffa04fecba51a--