Message-ID: <3336834.102991271813269811.JavaMail.jira@thor>
Date: Tue, 20 Apr 2010 21:27:49 -0400 (EDT)
From: "Ashutosh Chauhan (JIRA)"
To: pig-dev@hadoop.apache.org
Reply-To: pig-dev@hadoop.apache.org
Mailing-List: contact pig-dev-help@hadoop.apache.org; run by ezmlm
Subject: [jira] Commented: (PIG-798) Schema errors when using PigStorage and none when using BinStorage in FOREACH??
In-Reply-To: <1132343501.1241237010925.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ https://issues.apache.org/jira/browse/PIG-798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859159#action_12859159 ]

Ashutosh Chauhan commented on PIG-798:
--------------------------------------

Viraj,

I am confused by this description. It seems to me that you are first storing some data using BinStorage and then loading it using PigStorage. If that is so, obviously it will not work. PigStorage and BinStorage aren't interoperable in this way. Specifically, data stored using BinStorage can only be loaded using BinStorage.

> Schema errors when using PigStorage and none when using BinStorage in FOREACH??
> -------------------------------------------------------------------------------
>
>                 Key: PIG-798
>                 URL: https://issues.apache.org/jira/browse/PIG-798
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.2.0
>            Reporter: Viraj Bhat
>         Attachments: binstoragecreateop, schemaerr.pig, visits.txt
>
>
> In the following script I have a tab-separated text file, which I load using PigStorage() and store using BinStorage().
> {code}
> A = load '/user/viraj/visits.txt' using PigStorage() as (name:chararray, url:chararray, time:chararray);
> B = group A by name;
> store B into '/user/viraj/binstoragecreateop' using BinStorage();
> dump B;
> {code}
> I later load file 'binstoragecreateop' in the following way.
> {code}
> A = load '/user/viraj/binstoragecreateop' using BinStorage();
> B = foreach A generate $0 as name:chararray;
> dump B;
> {code}
> Result
> =======================================================================
> (Amy)
> (Fred)
> =======================================================================
> The above code works properly and returns the right results. If I use PigStorage() to achieve the same, I get the following error.
> {code}
> A = load '/user/viraj/visits.txt' using PigStorage();
> B = foreach A generate $0 as name:chararray;
> dump B;
> {code}
> =======================================================================
> {code}
> 2009-05-02 03:58:50,662 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1022: Type mismatch merging schema prefix. Field Schema: bytearray. Other Field Schema: name: chararray
> Details at logfile: /home/viraj/pig-svn/trunk/pig_1241236728311.log
> {code}
> =======================================================================
> So why should the semantics of BinStorage() be different from PigStorage() where it is ok not to specify a schema? Should it not be consistent across both?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
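
For the PigStorage() case in the quoted script, one possible workaround is to cast the field explicitly instead of declaring a type in the "as" clause. This is only a sketch: it reuses the path and field position from the quoted script, and it has not been checked against Pig 0.2.0, so whether it avoids ERROR 1022 on that version is an assumption.

{code}
-- Load without a schema, exactly as in the failing script quoted above.
A = load '/user/viraj/visits.txt' using PigStorage();
-- Cast the first field explicitly, then give it a name only (no type in the AS clause).
B = foreach A generate (chararray)$0 as name;
dump B;
{code}

The other route to a typed field is to declare the schema in the load statement, as the first quoted script does with "as (name:chararray, url:chararray, time:chararray)".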