From: Keith Wiley
Subject: Query crawls through reducer
Date: Fri, 22 Mar 2013 16:02:24 -0700
Message-Id: <098EF9B7-D25E-443F-A937-6E2C5B633061@keithwiley.com>
To: user@hive.apache.org

The following query translates into a many-map-single-reduce job (which is common) and also slogs through the reduce stage... it's killing the overall query:

select * from a where b >= 'c' order by b desc limit 100

Note that b is a partition column. What component is making the reducer heavy? Is it the order by or the limit (I'm sure it's not the partition-specific where clause, right?)? Are there ways to improve its performance?

________________________________________________________________________________
Keith Wiley     kwiley@keithwiley.com     keithwiley.com    music.keithwiley.com

"You can scratch an itch, but you can't itch a scratch. Furthermore, an itch can
itch but a scratch can't scratch. Finally, a scratch can itch, but an itch can't
scratch. All together this implies: He scratched the itch from the scratch that
itched but would never itch the scratch from the itch that scratched."
                                           --  Keith Wiley
________________________________________________________________________________