From: bharath vissapragada
Date: Fri, 14 Sep 2012 15:48:22 +0530
Subject: Running TPCH workload on Hive
To: hive-user@hadoop.apache.org

Hi folks,

I am trying to run the TPC-H workload on Hive (Hive-600). However, I am facing problems with configuration: the queries are taking an insanely long time.

I ran Q21 on a TPC-H workload of SF 100 (the same dataset on which the experiments in that doc were run) on a cluster of 8 datanodes+TT and 1 NN. My datanode config is as follows:

2 dual-core CPUs (4 threads in parallel in total)
3.8GB main memory per node

I configured 4 maps and 4 reducers per node. I've set hive-reducers max to 32 (the total number of reduce slots in the Hadoop cluster) instead of letting Hive decide it.

My Q21 has been running for 12 hrs so far, compared to the 2500 seconds mentioned in the results. I wonder what is so terribly wrong with my config. Some of my reducers take an insanely long time (sometimes 6 hrs) and others take 2 hrs (even this is more than the overall run time of 2500 secs for the same query in the results).

Can someone help me with this? Is the data partitioned or something (in the experiments)?

--
Regards,
Bharath .V
w:http://researchweb.iiit.ac.in/~bharath.v