From: Gabi Kazav
To: "user@hive.apache.org"
Subject: RE: Hive - max rows limit (int limit = 2^31). need Help (looks liek a bug)
Date: Wed, 29 May 2013 21:07:28 +0000

Thanks for helping. Here is some more data:

create table max_sint_rows (s1 string) partitioned by (p1 string) ROW FORMAT DELIMITED LINES TERMINATED BY '\n';

Create table small_table (p1 string) ROW FORMAT DELIMITED LINES TERMINATED BY '\n';

alter table max_sint_rows add partition (p1="1");

-----Original Message-----
From: John Meagher [mailto:john.meagher@gmail.com]
Sent: Thursday, May 30, 2013 12:06 AM
To: user@hive.apache.org
Subject: Re: Hive - max rows limit (int limit = 2^31). need Help (looks liek a bug)

What is the data type of the p1 column? I've used Hive with partitions containing far more than 2 billion rows without having any problems like this.

On Wed, May 29, 2013 at 2:41 PM, Gabi Kazav wrote:
> Hi,
>
> We are working on a Hive DB with our Hadoop cluster.
> We are now facing an issue when joining a big partition with more than 2^31 rows.
>
> When the partition has more than 2147483648 rows (even 2147483649) the output of the join is a single row.
> When the partition has fewer than 2147483648 rows (even 2147483647) the output is correct.
>
> Our test case:
>
> Create a table with 2147483649 rows in a partition with the value "1", then join this table to another table with a single row and a single column with the value "1" on the partition key.
> Later delete 2 rows and run the same join.
> 1st : only a single row is created
> 2nd : 2147483647 rows
>
> The query we run to test the case is:
>
> create table output_rows_over as
> select a.s1
> from max_sint_rows a join small_table b
> on (a.p1=b.p1);
>
> With more than 2^31 rows we got the following in the reducer log:
>
> 2013-05-27 21:51:14,186 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: TABLE_ID_1_ROWCOUNT:1
>
> With fewer than 2^31 rows we got the following in the reducer log:
>
> 2013-05-27 23:43:14,681 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: TABLE_ID_1_ROWCOUNT:2147483647
>
> Has anyone faced this issue?
> Does Hive have a workaround for that?
> I have huge partitions I need to work on and I cannot use Hive for that.
>
> Thanks,
>
> Gabi Kazav
> Infrastructure Team Leader, Pursway.com
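
For reference, the boundary in the report lines up with the range of Java's signed 32-bit int, whose maximum is 2^31 - 1 = 2147483647. The following is only a minimal sketch of that arithmetic. It assumes, without any confirmation in this thread, that a row count is narrowed from long to int somewhere along the join path; the class and method names are hypothetical and are not Hive APIs.

```java
// Hypothetical illustration only: these names are not part of Hive.
// It shows what happens to the row counts from the report if a
// 64-bit count is narrowed to a signed 32-bit int.
class IntBoundary {

    // Narrowing cast: keeps only the low 32 bits of the value.
    static int narrow(long rowCount) {
        return (int) rowCount;
    }

    // True when the count can be represented as an int without loss.
    static boolean fitsInInt(long rowCount) {
        return rowCount >= Integer.MIN_VALUE && rowCount <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // The working count, the 2^31 boundary, and the failing count.
        long[] counts = {2147483647L, 2147483648L, 2147483649L};
        for (long c : counts) {
            System.out.println(c + " fits=" + fitsInInt(c) + " narrowed=" + narrow(c));
        }
        // Prints:
        // 2147483647 fits=true narrowed=2147483647
        // 2147483648 fits=false narrowed=-2147483648
        // 2147483649 fits=false narrowed=-2147483647
    }
}
```

The working case (2147483647 rows) is the largest count that still fits in an int, while the failing case (2147483649 rows) wraps to a negative value. This shows why 2^31 is a natural boundary to suspect; it does not establish where, or whether, Hive performs such a conversion.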