Message-ID: <2743377.494671285926635231.JavaMail.jira@thor>
Date: Fri, 1 Oct 2010 05:50:35 -0400 (EDT)
From: "Thiruvel Thirumoolan (JIRA)"
To: hive-dev@hadoop.apache.org
Reply-To: hive-dev@hadoop.apache.org
Subject: [jira] Commented: (HIVE-1452) Mapside join on non partitioned table with partitioned table causes error
In-Reply-To: <9831853.236171278489951540.JavaMail.jira@thor>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ https://issues.apache.org/jira/browse/HIVE-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916842#action_12916842 ]

Thiruvel
Thirumoolan commented on HIVE-1452:
--------------------------------------------

> However, I see different results when MAPJOIN is used. Will open another JIRA for the same.

Have opened HIVE-1682 for the same.

> Mapside join on non partitioned table with partitioned table causes error
> -------------------------------------------------------------------------
>
> Key: HIVE-1452
> URL: https://issues.apache.org/jira/browse/HIVE-1452
> Project: Hadoop Hive
> Issue Type: Bug
> Components: CLI
> Affects Versions: 0.6.0
> Reporter: Viraj Bhat
> Assignee: Thiruvel Thirumoolan
>
> I am running a script which contains two tables: one is dynamically partitioned and stored as RCFormat, and the other is stored as a TXT file.
> The TXT file is around 397MB in size and has around 24 million rows.
> {code}
> drop table joinquery;
> create external table joinquery (
> id string,
> type string,
> sec string,
> num string,
> url string,
> cost string,
> listinfo array
>
> )
> STORED AS TEXTFILE
> LOCATION '/projects/joinquery';
> CREATE EXTERNAL TABLE idtable20mil(
> id string
> )
> STORED AS TEXTFILE
> LOCATION '/projects/idtable20mil';
> insert overwrite table joinquery
> select
> /*+ MAPJOIN(idtable20mil) */
> rctable.id,
> rctable.type,
> rctable.map['sec'],
> rctable.map['num'],
> rctable.map['url'],
> rctable.map['cost'],
> rctable.listinfo
> from rctable
> JOIN idtable20mil on (rctable.id = idtable20mil.id)
> where
> rctable.id is not null and
> rctable.part='value' and
> rctable.subpart='value' and
> rctable.pty='100' and
> rctable.uniqid='1000'
> order by id;
> {code}
> Result:
> Possible error:
> Data file split:string,part:string,subpart:string,subsubpart:string> is corrupted.
> Solution:
> Replace file. i.e. by re-running the query that produced the source table / partition.
> -----
> If I look at the mapper logs:
> {verbatim}
> Caused by: java.io.IOException: java.io.EOFException
>     at org.apache.hadoop.hive.ql.exec.persistence.MapJoinObjectValue.readExternal(MapJoinObjectValue.java:109)
>     at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1792)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1751)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
>     at org.apache.hadoop.hive.ql.util.jdbm.htree.HashBucket.readExternal(HashBucket.java:284)
>     at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1792)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1751)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
>     at org.apache.hadoop.hive.ql.util.jdbm.helper.Serialization.deserialize(Serialization.java:106)
>     at org.apache.hadoop.hive.ql.util.jdbm.helper.DefaultSerializer.deserialize(DefaultSerializer.java:106)
>     at org.apache.hadoop.hive.ql.util.jdbm.recman.BaseRecordManager.fetch(BaseRecordManager.java:360)
>     at org.apache.hadoop.hive.ql.util.jdbm.recman.BaseRecordManager.fetch(BaseRecordManager.java:332)
>     at org.apache.hadoop.hive.ql.util.jdbm.htree.HashDirectory.get(HashDirectory.java:195)
>     at org.apache.hadoop.hive.ql.util.jdbm.htree.HTree.get(HTree.java:155)
>     at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.get(HashMapWrapper.java:114)
>     ... 11 more
> Caused by: java.io.EOFException
>     at java.io.DataInputStream.readInt(DataInputStream.java:375)
>     at java.io.ObjectInputStream$BlockDataInputStream.readInt(ObjectInputStream.java:2776)
>     at java.io.ObjectInputStream.readInt(ObjectInputStream.java:950)
>     at org.apache.hadoop.io.BytesWritable.readFields(BytesWritable.java:153)
>     at org.apache.hadoop.hive.ql.exec.persistence.MapJoinObjectValue.readExternal(MapJoinObjectValue.java:98)
> {verbatim}
> I am trying to create a test case that demonstrates this error.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.