From: Prasanth Jayachandran
To: "user@hive.apache.org"
Subject: Re: Hive Cli ORC table read error with limit option
Date: Thu, 10 Mar 2016 23:11:03 +0000
Message-ID: <43B7F843-E70F-42BD-A15A-9AF684F235ED@hortonworks.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="_000_43B7F843E70F42BDA15A9AF684F235EDhortonworkscom_"

--_000_43B7F843E70F42BDA15A9AF684F235EDhortonworkscom_
Content-Type: text/plain; charset="us-ascii"

Could you attach the empty ORC files from one of the broken partitions somewhere? I can run some tests on them to see why it's happening.

Thanks
Prasanth

On Mar 8, 2016, at 12:02 AM, Biswajit Nayak <biswajit@altiscale.com> wrote:

Both parameters are set to false by default.

hive> set hive.optimize.index.filter;
hive.optimize.index.filter=false
hive> set hive.orc.splits.include.file.footer;
hive.orc.splits.include.file.footer=false
hive> 

>>> I suspect this might be related to having 0 row files in the buckets not having any recorded schema.

Yes, there are a few files with 0 rows, but the query works with other partitions (which have 0-row files). Out of 30 partitions (for a month), 3-4 partitions have this issue. Even reloading the data does not change anything. The query works fine in MR now, but still has the issue in Tez.



On Tue, Mar 8, 2016 at 2:43 AM, Gopal Vijayaraghavan <gopalv@apache.org> wrote:

> c                varchar(2) ...
> Num Buckets:         7

I suspect this might be related to having 0 row files in the buckets not having any recorded schema.

You can also experiment with hive.optimize.index.filter=false, to see if
the zero row case is artificially produced via predicate push-down.


That shouldn't be a problem unless you've turned on
hive.orc.splits.include.file.footer=true (recommended to be false).

Your row-locations don't actually match any Apache source jar in my
builds. Are there any other patches to consider?

Cheers,
Gopal
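
A minimal sketch of how the checks discussed in this thread could be run back to back, for example as a script passed to `hive -f`. The table name `events_orc` and the partition value `dt='2016-03-01'` are hypothetical placeholders, not taken from the thread. The two ORC properties are the ones named above; `hive.execution.engine` is assumed here as the switch between MR and Tez.

```sql
-- Hypothetical table and partition: substitute one of the failing partitions.

-- Step 1: print the current values (each prints property=value).
set hive.optimize.index.filter;
set hive.orc.splits.include.file.footer;

-- Step 2: baseline on MapReduce, where the query is reported to work.
set hive.execution.engine=mr;
select * from events_orc where dt='2016-03-01' limit 10;

-- Step 3: same query on Tez, with predicate push-down into ORC and
-- footer-in-splits both explicitly off, so neither can be the trigger.
set hive.execution.engine=tez;
set hive.optimize.index.filter=false;
set hive.orc.splits.include.file.footer=false;
select * from events_orc where dt='2016-03-01' limit 10;
```

If the Tez run in step 3 still fails with both properties set to false, predicate push-down is unlikely to be producing the zero-row case, which would be consistent with what Biswajit reports above.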




--_000_43B7F843E70F42BDA15A9AF684F235EDhortonworkscom_--